Value Compass

Abstract: 

Large Language Models (LLMs) have achieved remarkable breakthroughs, yet their growing integration into everyday human life raises important societal concerns. Incorporating diverse human values into these powerful generative models is critical for enhancing AI safety, respecting cultural and individual values, and potentially boosting the productivity and innovation of future human–AI hybrid collectives. This project adopts an interdisciplinary approach, integrating AI research with philosophical, psychological, and social science perspectives on values, ethics, and cultures. We focus on three fundamental research questions. RQ1: What values does AI have? We evaluate the value orientations of generative models and examine how their internal values influence their behaviour. RQ2: What values should AI adopt? We investigate whether LLMs exhibit stable value structures and which values best reduce harm and enhance user satisfaction. RQ3: How can AI be aligned with diverse and evolving human values across different cultural contexts? We aim to ensure alignment as models grow more capable and societal norms continue to shift. Through these efforts, we are developing systematic alignment frameworks that meet requirements of clarity, adaptability, and transparency. Our ultimate goal is to help build a symbiotic future in which humans and AI coexist, collaborate productively, and ultimately co-evolve. 

Representative Publications: 

Open-source contributions: 

Other achievements: 

More Information: Value Compass Homepage