TACIT's Corpus Management tool provides a simple user interface for managing different types of corpora (sets of documents or text) easily and efficiently
The Reddit Crawler provides full access to all data available on Reddit.com
The Twitter Crawler tool collects text from the Twitter website and writes that data into text files that are readable by automated text analysis programs
The US Congress Crawler collects speech transcription data from the Library of Congress THOMAS Website, covering present-day speeches back to the 101st Congress
The Latin Crawler tool collects text from The Latin Library Website and writes that data into text files that are readable by automated text analysis programs
The StackExchange Crawler provides full access to all data available on stackexchange.com, a question and answer website covering topics in various fields
The UC Santa Barbara Presidential Papers Crawler collects text from The American Presidency Project Website and writes that data into text files that are readable by automated text analysis programs
The Hansard United Kingdom Parliament Crawler collects text from The Hansard Parliament Website, which contains the official report of all parliamentary debates
The PLOS Crawler tool collects text from the PLOS ONE journal website
The Typepad Blog Crawler tool collects text from the Typepad website
The Frontiers Journal Crawler tool collects text from the Frontiers in Psychology website, a site that provides an Open Science platform and publishes many widely known open-access journals
Co-occurrence analysis gives insights into the interconnection between terms or entities within any given text
The TACIT Word Count plugin was developed exclusively for the TACIT tool as a more comprehensive approach to word count techniques
The LIWC-style word count plugin was developed using LIWC documentation and reverse engineering of the program to understand and implement the algorithm
The Hierarchical Clustering tool allows you to visualize the clusters in a dendrogram, which presents a hierarchical view of the clusters (from the input dataset) and their relations
The K-means clustering plugin groups texts into a user-specified number of clusters such that each text is closer to its own cluster's centroid than to the centroid of any other cluster
This tool performs Naive Bayes classification, a supervised learning algorithm, to assess the uniqueness of text files belonging to a user-determined set of two or more distinct groups
TACIT's SVM classification plugin can be used to investigate the degree to which two classes can be separated based on the provided data
TACIT's LDA plugin allows you to explore the structure of topics within a set of documents and the relations between them.
TACIT's seeded z-label LDA plugin expands upon the capabilities of LDA by implementing the z-label LDA algorithm
Performs hierarchical Dirichlet process (HDP), a nonparametric Bayesian model for clustering problems involving multiple groups of data.
Performs the online variational Bayes (VB) algorithm for Latent Dirichlet Allocation (LDA).
Performs supervised latent Dirichlet allocation (sLDA), a statistical model of labelled documents.
The TurboTopics plugin finds significant multiword phrases in topics.
The HLDA plugin implements a topic model that finds a hierarchy of topics. The structure of the hierarchy is determined by the data.
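
As an illustration of the nearest-centroid idea in the K-means clustering entry above, here is a minimal Python sketch. It is not TACIT's implementation, and the plugin may differ in vectorization, initialization, and distance measure. The function names (`vectorize`, `distance`, `kmeans`), the term-frequency representation, and the choice to seed centroids with the first k texts are all assumptions made for this example.

```python
from collections import Counter
import math


def vectorize(texts):
    """Turn each text into a term-frequency vector over a shared vocabulary."""
    counts = [Counter(t.lower().split()) for t in texts]
    vocab = sorted(set().union(*counts))
    return [[c[w] for w in vocab] for c in counts]


def distance(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def kmeans(vectors, k, iterations=20):
    """Cluster vectors into k groups; returns (labels, centroids)."""
    # Assumption: seed centroids with the first k vectors so runs are reproducible.
    centroids = [list(v) for v in vectors[:k]]
    labels = [0] * len(vectors)
    for _ in range(iterations):
        # Assignment step: each vector joins the cluster with the nearest centroid.
        labels = [
            min(range(k), key=lambda j: distance(v, centroids[j]))
            for v in vectors
        ]
        # Update step: move each centroid to the mean of its members.
        for j in range(k):
            members = [v for v, lab in zip(vectors, labels) if lab == j]
            if members:
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


texts = [
    "cat dog cat",
    "stock market stock",
    "dog cat dog",
    "market stock market",
]
labels, centroids = kmeans(vectorize(texts), k=2)
print(labels)  # [0, 1, 0, 1]
```

The two animal texts end up in one cluster and the two finance texts in the other, because each is nearer to its own cluster's centroid than to the other one.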