Abstract:
In this article, we propose an open-source toolkit to extract, parse,
and analyze Wikipedia talk pages. The core parser uses a
tree-based approach to parse the unstructured comments and a
JSON (JavaScript Object Notation) structure to store them in a
NoSQL (not only SQL) database. User-friendly, high-level analysis
methods are built on top of the NoSQL database and can
be used to understand the collaboration dynamics on article talk
pages.
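As a rough illustration of the tree-based idea, the sketch below turns a colon-indented talk page thread into a nested structure that serializes directly to JSON. It is not the toolkit's parser: the function name `parse_thread`, the field names `depth`, `text`, and `replies`, and the sample thread are all hypothetical, and the sketch assumes only the common talk page convention that each reply is indented with one more leading colon than its parent.

```python
import json


def parse_thread(wikitext):
    """Build a reply tree from colon-indented talk page lines (illustrative only)."""
    # Sentinel root sits above every real comment (depth -1).
    root = {"depth": -1, "text": None, "replies": []}
    stack = [root]  # path from the root to the most recently added comment
    for line in wikitext.splitlines():
        if not line.strip():
            continue  # skip blank lines
        body = line.lstrip(":")
        depth = len(line) - len(body)  # number of leading colons
        # Pop until the top of the stack is shallower than this comment;
        # that node is the comment being replied to.
        while stack[-1]["depth"] >= depth:
            stack.pop()
        node = {"depth": depth, "text": body.strip(), "replies": []}
        stack[-1]["replies"].append(node)
        stack.append(node)
    return root["replies"]


# Hypothetical thread: two top-level comments, the first with nested replies.
sample = (
    "First comment\n"
    ":Reply to first\n"
    "::Nested reply\n"
    ":Second reply\n"
    "New comment"
)
thread = parse_thread(sample)
print(json.dumps(thread, indent=2))
```

Because each comment is a plain dictionary holding a list of replies, the resulting tree could be stored as a document in a document-oriented NoSQL database without further transformation.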
CCS CONCEPTS
• Information systems → Specialized information retrieval;
• Human-centered computing → Wikis.