
Joseph Chee Chang

Technical HCI Research・Semantic Scholar・Allen Institute for AI

I am a research scientist at AI2. Previously, I received my PhD from the Language Technologies Institute at CMU, specializing in Human-Computer Interaction, Sensemaking, Crowdsourcing, and Applied Machine Learning. I was advised by Aniket Kittur, and my research was supported by Google, Bosch, Yahoo, the ONR, and the NSF.

My research focuses on developing information systems that help users and crowdworkers explore and make sense of large amounts of information so they can make better decisions. Examples include using crowds to synthesize search results into coherent articles, and empowering consumers to navigate thousands of reviews and online sources, gain deep insights, and make confident decisions.

Solvent

A Mixed Initiative System for Finding Analogies between Research Papers

Analogies between distant domains often lead to scientific discoveries. However, it can be prohibitively difficult for researchers to find useful analogies in unfamiliar domains because search engines support this kind of search poorly. We introduce Solvent, a mixed-initiative system in which annotators structure the abstracts of academic papers into different aspects, and a semantic model uses those aspects to find analogies among research papers and across domains. Our results point to a new path towards computationally supported knowledge sharing in research communities.

Joel Chan, Joseph Chee Chang, Tom Hope, Dafna Shahaf, Aniket Kittur.
ACM CSCW 2018 (r=27% N=385)

- Analogy, CSCW, Crowdsourcing, HCI, Information Retrieval, Machine Learning, Search, Sensemaking

Evorus

Crowd-powered Conversational Assistant Built to Automate Itself Over Time

BEST PAPER NOMINATION
Crowd-powered chatbots are more robust than current purely AI approaches, but can be slower and more expensive at runtime. We combine the two approaches, aiming for high quality, low latency, and low cost. We introduce Evorus, a crowd-powered chatbot that automates itself over time by learning to integrate AI chatbots, reuse responses, and assess response quality. A 5-month-long public deployment study shows promising results. You can try talking to Evorus today.

Kenneth Huang, Joseph Chee Chang, Jeff Bigham.
ACM SIGCHI 2018 (r=26% N=2595)

- Best Papers, CHI, Crowdsourcing, HCI, Machine Learning, SIGCHI

Revolt

Collaborative Crowdsourcing for Labeling Machine Learning Datasets

Generating comprehensive labeling guidelines for crowdworkers can be challenging for complex datasets. Revolt harnesses crowd disagreements to identify ambiguous concepts in the data and coordinates the crowd to collaboratively create rich structures that let requesters make post hoc decisions, removing the need for comprehensive guidelines and enabling dynamic label boundaries.

Work done during an internship at Microsoft Research, Redmond.

Joseph Chee Chang, Saleema Amershi, Ece Kamar.
ACM SIGCHI 2017 (r=25% N=2424)

- CHI, Classification, Crowdsourcing, HCI, Labeling, Machine Learning, SIGCHI, Sensemaking

Alloy

Clustering with Crowds and Computation

BEST PAPER NOMINATION
HCOMP 2016 INVITED ENCORE TALK
Many crowd clustering approaches have difficulties providing global context to workers in order to generate meaningful categories. Alloy uses a sample-and-search technique to provide a better understanding of the global context. It also combines the in-depth semantic knowledge from human computation and the scalability of machine learning models to create rich structures from unorganized documents with high quality and efficiency.

Joseph Chee Chang, Aniket Kittur, Nathan Hahn.
ACM SIGCHI 2016 (r=23% N=2435)

- Best Papers, Crowdsourcing, HCI, Information Synthesis, Machine Learning, SIGCHI, Sensemaking

The Knowledge Accelerator

Big Picture Thinking in Small Pieces

BEST PAPER NOMINATION
Answering complex questions such as “How do I grow better tomatoes?” often requires individuals to conduct extensive online research and synthesis. Can we crowdsource this complex, high-context process by distributing it as microtasks across 100 crowdworkers? The Knowledge Accelerator uses crowdworkers to extract and synthesize text clips across web pages into coherent articles without a centralized coordinator.

Nathan Hahn, Joseph Chee Chang, Aniket Kittur.
ACM SIGCHI 2016 (r=23% N=2435)

- Best Papers, Crowdsourcing, HCI, Information Foraging, Information Retrieval, Information Synthesis, SIGCHI, Sensemaking

Twitter Code-Switching

Recurrent-Neural-Network for Language Detection on Twitter Code-Switching Corpus

Code-switching is common on social media, where people use it to express solidarity or establish authority. While past work on automatic code-switching detection depends on dictionary look-up or named-entity recognition, our recurrent neural network model relies only on raw features and achieved a 17% error rate reduction over the top systems in the EMNLP’14 Code-Switching Workshop.

Final project for the Deep Learning course at CMU.

Joseph Chee Chang, Chu-Cheng Lin.
arXiv (course final project)

- Code-Switching, Deep Learning, Machine Learning, NLP, Neural Network, arXiv, pre-print

TermMine

Learning to Find Translations and Transliterations on the Web

TermMine is an information extraction system that can automatically mine translation pairs of terms from the web. We used a small set of terms and translations to gather mixed-code text from the web to train a CRF model that can identify translation pairs at run-time.

Joseph Chee Chang, Jason S. Chang, Roger Jang.
ACL 2012 (r=21% N=369)

- ACL, Information Extraction, Machine Learning, NLP, Translation

WikiSense

Supersense Tagging Named Entities on Wikipedia

We introduced a method for classifying named entities into broad semantic categories in WordNet. We extracted rich features from Wikipedia, allowing us to classify named entities with high precision and coverage. The result is a large-scale named-entity semantic database with 1.2 million entries and over 95% accuracy, covering 80% of all named entities found on Wikipedia.

Joseph Chee Chang, Richard Tsai, Jason S. Chang.
PACLIC 2009

- Information Extraction, Machine Learning, NLP, PACLIC