SLIC AI is an exascale ML tool that analyzes and extracts unique information from text documents such as scientific publications, e-mails, and reports, together with their metadata. SLIC AI also has an interactive mode that keeps a human (domain expert) on the loop for better accuracy and specificity. SLIC AI has the following capabilities:
- performs robust unsupervised learning (it requires no training or labeled data) on an arbitrary text corpus and extracts topics and subtopics hierarchically, while taking the semantics of the text into account.
- uses a LANL-patented method to determine the number of topics, which is vital for explainability.
- is an HPC tool that can analyze exascale data (sparse or dense), with unique scaling on heterogeneous CPU/GPU clusters.
- can build unique and specific corpora and knowledge graphs through interactions with SMEs (human on the loop).
- can rank authors and institutions by their research on a specific topic, using their network interactions (e.g., co-authorship and co-citation data).
- can determine the roles of authors, such as brain, working bee, mediator, and others, based on graph centrality.
- can build and analyze the scientific ecosystem of a) a country, b) an institution, or c) a group of authors.
- can build a temporal, topic-specific author profile that includes the author's scientific social interactions (such as citation and co-author networks) as well as the evolution of their professional affiliations.
- can determine changes in, or the evolution of, a specific technology trend of interest related to a country, an institution, or a group of authors.
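To make the idea of centrality-based author ranking concrete, here is a minimal Python sketch. It is not SLIC AI's implementation: the function names (`coauthor_graph`, `degree_centrality`, `rank_authors`), the toy author lists, and the choice of plain degree centrality are all assumptions made for illustration. The description above says only that ranking and roles are based on network interactions and graph centrality.

```python
from collections import defaultdict
from itertools import combinations


def coauthor_graph(papers):
    """Build a weighted co-authorship graph (hypothetical helper).

    `papers` is a list of author-name lists; an edge weight counts
    how many papers two authors share.
    """
    graph = defaultdict(lambda: defaultdict(int))
    for authors in papers:
        unique = sorted(set(authors))
        for a in unique:
            graph[a]  # touch the key so solo authors appear as isolated nodes
        for a, b in combinations(unique, 2):
            graph[a][b] += 1
            graph[b][a] += 1
    return graph


def degree_centrality(graph):
    """Fraction of the other authors that each author has co-authored with."""
    n = len(graph)
    if n < 2:
        return {a: 0.0 for a in graph}
    return {a: len(neighbors) / (n - 1) for a, neighbors in graph.items()}


def rank_authors(papers):
    """Rank authors by degree centrality, breaking ties alphabetically."""
    scores = degree_centrality(coauthor_graph(papers))
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


# Toy data, invented for this example only.
papers = [
    ["Author A", "Author B", "Author C"],
    ["Author A", "Author D"],
    ["Author B", "Author C"],
    ["Author E"],
]

ranking = rank_authors(papers)
for name, score in ranking:
    print(f"{name}: {score:.2f}")
```

On this toy input, "Author A" ranks first because it is connected to three of the four other authors. A role analysis along the lines described above would likely combine several centrality measures (for example betweenness to spot mediators), which this sketch does not attempt.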
Papers
- Interactive Distillation of Large Single-Topic Corpora of Scientific Papers.
N. Solovyev, R. Barron, M. Bhattarai, M. Eren, K. O. Rasmussen, ...
arXiv preprint arXiv:2309.10772: 2023.
- Senmfk-split: Large corpora topic modeling by semantic non-negative matrix factorization with automatic model selection.
M. E. Eren, N. Solovyev, M. Bhattarai, K. Ø. Rasmussen, C. Nicholas, ...
Proceedings of the 22nd ACM Symposium on Document Engineering: 2022.
- Finding the number of latent topics with semantic non-negative matrix factorization.
R. Vangara, M. Bhattarai, E. Skau, G. Chennupati, H. Djidjev, T. Tierney, ...
IEEE Access 9: 2021.
- COVID-19 multidimensional Kaggle literature organization.
M. Eren, N. Solovyev, C. Hamer, R. McDonald, B.S. Alexandrov, and C. Nicholas.
Proceedings of the 21st ACM Symposium on Document Engineering, pp. 1-4. 2021.
- Semantic Nonnegative Matrix Factorization with Automatic Model Determination for Topic Modeling.
R. Vangara, E. Skau, G. Chennupati, et al.
Proceedings of 19th IEEE International Conference on Machine Learning and Applications, December 14-17, 2020.