STEM Speaker Series: “Representing Knowledge through Word and Graph Embeddings” with Dr. Steven Skiena

Posted on February 11, 2020 by Clara Tran

Date: 02/11/2020

Time: 1:00 pm - 2:00 pm

Location
Special Collections Seminar Room

Description

Abstract

Libraries are about representing large quantities of knowledge to make them broadly useful and available. Similarly, word and graph embeddings (e.g., word2vec) provide powerful ways to reduce large text corpora to concise features readily applicable to a variety of problems in NLP and data science. I will introduce word embeddings and apply them in a variety of new and interesting directions, including:

(1) Multilingual NLP — The Polyglot project (www.polyglot-NLP.com) employs deep learning and other techniques to build a basic NLP pipeline (including entity recognition, POS tagging, and sentiment analysis) for over 100 different languages. We train our systems on each language’s Wikipedia edition, which provides unified data resources in the absence of explicitly annotated data but poses substantial challenges in interpretation and evaluation.

(2) Detecting Historical Shifts in Word Meaning — Words like “gay” and “mouse” have substantially shifted their meanings over time in response to societal and technological changes. We use word embeddings trained over texts drawn from different time periods to detect changes in word meanings. This is part of our efforts in historical trends analysis.

(3) Feature Extraction from Graphs — We present DeepWalk, our widely adopted approach for learning latent representations of vertices in a network. DeepWalk uses local information obtained from truncated random walks to learn embeddings, treating each walk as the equivalent of a sentence in a language. It is suitable for a broad class of applications such as network classification and anomaly detection. We also introduce new graph embedding techniques based on random projections, which produce DeepWalk-quality embeddings thousands of times faster than previous algorithms.
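
For readers unfamiliar with the idea in (3), the walk-generation step can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the DeepWalk authors' implementation: the toy graph, the function name `truncated_random_walks`, and the parameter values are all hypothetical and chosen only for the example. In the full method the resulting walks would then be passed to a skip-gram model such as word2vec to learn the vertex embeddings; that step is omitted here.

```python
import random


def truncated_random_walks(graph, walks_per_vertex=2, walk_length=5, seed=0):
    """Generate short random walks starting from every vertex.

    graph: dict mapping each vertex to a list of neighbouring vertices.
    Returns a list of walks. Each walk is a list of vertex names (strings),
    i.e. a "sentence" whose "words" are vertices.
    """
    rng = random.Random(seed)
    walks = []
    for _ in range(walks_per_vertex):
        vertices = list(graph)
        rng.shuffle(vertices)  # visit start vertices in a random order each pass
        for start in vertices:
            walk = [start]
            while len(walk) < walk_length:
                neighbours = graph[walk[-1]]
                if not neighbours:  # dead end: truncate the walk early
                    break
                walk.append(rng.choice(neighbours))
            walks.append([str(v) for v in walk])
    return walks


# Hypothetical toy graph: two triangles joined by the edge c-d.
toy_graph = {
    "a": ["b", "c"],
    "b": ["a", "c"],
    "c": ["a", "b", "d"],
    "d": ["c", "e", "f"],
    "e": ["d", "f"],
    "f": ["d", "e"],
}

walks = truncated_random_walks(toy_graph)
for walk in walks[:3]:
    print(" ".join(walk))  # each printed line is one vertex "sentence"
```

Because vertices in the same triangle tend to appear together in these walks, a word-embedding model trained on them would be expected to place such vertices close together, which is the intuition behind treating walks as sentences.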

================

Biography

Steven Skiena is Distinguished Teaching Professor of Computer Science and Director of the Institute for AI-Driven Discovery and Innovation at Stony Brook University. His research interests include data science, bioinformatics, and algorithms. He is the author of six books, including “The Algorithm Design Manual”, “The Data Science Design Manual”, and “Who’s Bigger: Where Historical Figures Really Rank”.

Skiena received his Ph.D. in Computer Science from the University of Illinois in 1988. He is the author of over 150 technical papers. He is a Fellow of the American Association for the Advancement of Science (AAAS), a former Fulbright scholar, and recipient of the ONR Young Investigator Award and the IEEE Computer Science and Engineering Teaching Award. More information is available at http://www.cs.stonybrook.edu/~skiena/.

Registration

Bookings are closed for this event.


Clara Tran

Head, Science and Engineering at Stony Brook University Libraries

Clara is a member of the Library STEM Team.
Email: clara.tran@stonybrook.edu


Posted in Sciences Events
