Natural language processing (NLP) is a subfield of artificial intelligence (AI) and computational linguistics in which computers process natural language. The growth in available data and in the number of applications has rapidly increased the importance of research in this area. Recently, NLP has become prominent in promoting the use of technology not only in technical fields but also in areas such as the social sciences.
At the same time, ever-evolving AI-based NLP practices raise various problems of ethics, explainability, transparency, and interpretability. In other words, we need to know why and how AI-based algorithms, not just those specific to NLP, achieve the results they obtain.
By nature, NLP deals with natural language texts created by people. To transfer texts into a format that computers can process, the properties of natural language must be expressed mathematically in a digital environment. Word representations (embeddings) represent the words of a language as vectors in a multidimensional space. Acting as building blocks, they make NLP applications easier and faster to develop. The interpretation of word representations, which are among the most frequently used techniques in NLP, and transparency regarding the biases encoded in them are issues of great importance.
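As a minimal illustration of the idea that words become vectors whose geometry carries meaning, the sketch below compares toy word vectors with cosine similarity. The four-dimensional vectors are made-up values for illustration only; real pretrained embeddings such as word2vec or GloVe typically have hundreds of dimensions.

```python
import math

# Toy 4-dimensional word vectors (values invented for illustration;
# real embeddings are learned from large text corpora).
embeddings = {
    "king":  [0.8, 0.6, 0.1, 0.3],
    "queen": [0.7, 0.7, 0.2, 0.3],
    "apple": [0.1, 0.2, 0.9, 0.8],
}

def cosine_similarity(u, v):
    """Cosine of the angle between two vectors: 1.0 means same direction."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

# Semantically related words should point in similar directions.
print(cosine_similarity(embeddings["king"], embeddings["queen"]))
print(cosine_similarity(embeddings["king"], embeddings["apple"]))
```

With these toy values, "king" and "queen" come out far more similar to each other than either is to "apple", mirroring how distances in a learned embedding space track semantic relatedness.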
Since word representations are used as critical building blocks, whether they are interpretable and explainable, and which biases they encode, is an increasingly important research question.
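One common way the bias literature probes embeddings (e.g., the "he - she" direction of Bolukbasi et al.) is to project word vectors onto a direction defined by a contrasting word pair. The sketch below uses invented three-dimensional vectors purely to show the mechanics; the words and values are assumptions, not measurements from any real embedding.

```python
import math

# Toy vectors (made up for illustration). In actual bias studies,
# vectors from real pretrained embeddings are used instead.
vectors = {
    "he":     [0.9, 0.1, 0.4],
    "she":    [0.1, 0.9, 0.4],
    "doctor": [0.7, 0.3, 0.5],
    "nurse":  [0.2, 0.8, 0.5],
}

def bias_score(word):
    """Projection of a word vector onto the normalized he-she direction.
    Positive values lean toward 'he', negative toward 'she'."""
    direction = [a - b for a, b in zip(vectors["he"], vectors["she"])]
    norm = math.sqrt(sum(d * d for d in direction))
    return sum(w * d / norm for w, d in zip(vectors[word], direction))

print(bias_score("doctor"))  # positive in this toy data
print(bias_score("nurse"))   # negative in this toy data
```

A nonzero projection for a profession word signals an encoded association with one side of the pair; this is the kind of quantitative, transparent diagnostic that interpretability research builds on.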
In our projects, we aim to address the ethical concerns raised by the AI applications that have entered our lives by developing innovative methods for the interpretability and explainability of word representations.
Senel, L. K., Utlu, I., Sahinuc, F., Ozaktas, H. M., and Koç, A., 2020. "Imparting Interpretability to Word Embeddings while Preserving Semantic Structure," Natural Language Engineering, Cambridge University Press, pp. 1-26.
Senel, L. K., Utlu, I., Yucesoy, V., Koç, A., and Çukur, T., 2018a. "Semantic Structure and Interpretability of Word Embeddings," IEEE/ACM Transactions on Audio, Speech and Language Processing, vol. 26, no. 10, pp. 1769-1779.
Yucesoy, V., and Koç, A., 2019. “Co-occurrence Weight Selection in Generation of Word Embeddings for Low Resource Languages,” ACM Transactions on Asian and Low-Resource Language Information Processing, vol. 18, no. 3, Article 22.
Senel, L. K., Yucesoy, V., Koç, A., and Çukur, T., 2017. "Measuring Cross-Lingual Semantic Similarity Across European Languages," 40th International Conference on Telecommunications and Signal Processing (TSP), Barcelona, Spain.
Senel, L. K., Utlu, I., Yucesoy, V., Koç, A., and Çukur, T., 2018b. "Generating Semantic Similarity Atlas for Natural Languages," IEEE Workshop on Spoken Language Technologies, Athens, Greece.