[{"data":1,"prerenderedAt":369},["ShallowReactive",2],{"content-query-Zr7Ta05R4f":3,"content-navigation-8C37fagqQL":183},{"_path":4,"_dir":5,"_draft":6,"_partial":6,"_locale":7,"title":8,"description":9,"heading":10,"abstract":11,"year":12,"tags":13,"schemaOrg":16,"body":70,"_type":152,"_id":153,"_source":154,"_file":155,"_stem":156,"_extension":157,"head":158},"/research/invisible-coalition-partner","research",false,"","The Invisible Coalition Partner: How LLMs Vote When Democracy Gets Concrete","Dual-instrument study comparing 66 LLMs on abstract Swiss Smartvote questionnaire and 48 real federal referenda in four languages, revealing that the established leftward bias does not generalize to concrete policy decisions.","LLM Political Bias","Introduced dual-instrument methodology showing the established leftward LLM bias is instrument-dependent: on abstract questionnaires, 66 models converge center-left; on 48 real Swiss referenda, the agreement gradient flips to center-peaked, with language shifting answers more than political content for some models.","2026",[14,15],"arXiv Preprint","Independent Research",[17,41],{"@context":18,"@type":19,"headline":8,"description":20,"datePublished":21,"dateModified":21,"author":22,"inLanguage":37,"keywords":38,"articleSection":39,"isAccessibleForFree":40},"https://schema.org","ScholarlyArticle","Dual-instrument study comparing LLM political bias on abstract questionnaires versus concrete policy decisions using Swiss democratic instruments, revealing instrument-dependent bias and cross-linguistic instability.","2026-04-01",[23],{"@type":24,"givenName":25,"familyName":26,"name":27,"identifier":28,"url":29,"affiliation":30},"Person","Joel","Barmettler","Joel P. Barmettler","0009-0006-5118-7129","https://joelbarmettler.xyz",{"@type":31,"name":32,"address":33},"Organization","Independent Researcher",{"@type":34,"addressLocality":35,"addressCountry":36},"PostalAddress","Zurich","Switzerland","en","LLM Political Bias, Large Language Models, Swiss Democracy, Smartvote, Volksabstimmungen, Multilingual AI, Political Compass, RLHF, Direct Democracy","Artificial Intelligence, Political Science, Computational Social Science",true,{"@context":18,"@type":42,"mainEntity":43},"FAQPage",[44,50,54,58,62,66],{"@type":45,"name":46,"acceptedAnswer":47},"Question","Do large language models have a political bias?",{"@type":48,"text":49},"Answer","On abstract political questionnaires, yes: 66 LLMs from 27 model families converge on a center-left position, replicating prior research. However, on concrete policy decisions (Swiss federal referenda), the bias shifts to centrist and status-quo-favoring, suggesting the established leftward bias is instrument-dependent.",{"@type":45,"name":51,"acceptedAnswer":52},"What is the dual-instrument methodology?",{"@type":48,"text":53},"The study uses two independent instruments grounded in Swiss democratic reality: (1) the Smartvote questionnaire with 75 abstract policy questions administered to 66 LLMs and compared to 184 Swiss parliamentarians, and (2) 48 real federal referenda (Volksabstimmungen) presented to 9 flagship LLMs in four national languages under three information conditions, compared to actual outcomes and party recommendations.",{"@type":45,"name":55,"acceptedAnswer":56},"Does the language of a political question change the LLM's answer?",{"@type":48,"text":57},"Dramatically for some models. Cross-linguistic consistency ranges from 98% (GPT-5.4) to 50% (Mistral). Mistral's approval rate swings from 17% in German to 82% in Romansh. These shifts do not track the actual Swiss linguistic voting divide (Röstigraben) but reflect model-internal language processing instabilities.",{"@type":45,"name":59,"acceptedAnswer":60},"What is the gradient flip finding?",{"@type":48,"text":61},"On the abstract Smartvote questionnaire, all models show highest agreement with left-wing parties (SP, Grüne) and lowest with right-wing SVP. On concrete referenda, this gradient flips: models agree most with centrist Die Mitte and FDP, not with SP and Grüne. The Wilcoxon signed-rank test confirms this is systematic (p = 0.008).",{"@type":45,"name":63,"acceptedAnswer":64},"Do LLMs exhibit change-aversion on referenda?",{"@type":48,"text":65},"Two models (Grok and Mistral) vote Nein on 83-94% of referenda regardless of whether the proposal is progressive or conservative, suggesting systematic change-aversion rather than political ideology.",{"@type":45,"name":67,"acceptedAnswer":68},"How well do LLMs predict the popular vote?",{"@type":48,"text":69},"Alignment varies dramatically: GPT-5.4 matches 97.9% of referendum outcomes while Grok matches only 60.4%. A temporal analysis splitting pre- and post-release referenda found no significant drop in alignment, arguing against pure training data memorization.",{"type":71,"children":72,"toc":147},"root",[73,81,87,94,99,105,110,115,120,131,141],{"type":74,"tag":75,"props":76,"children":78},"element","h1",{"id":77},"the-invisible-coalition-partner-how-llms-vote-when-democracy-gets-concrete",[79],{"type":80,"value":8},"text",{"type":74,"tag":82,"props":83,"children":84},"p",{},[85],{"type":80,"value":86},"Prior work has established that instruction-tuned LLMs lean left of center on abstract political questionnaires. I tested whether this holds when models face real policy decisions. Using a dual-instrument methodology grounded in Swiss democracy, I administered the Smartvote questionnaire (75 policy questions, 66 LLMs vs. 184 elected parliamentarians) and 48 federal referenda to 9 flagship models in four national languages. Abstract and concrete instruments tell fundamentally different stories about the same models.",{"type":74,"tag":88,"props":89,"children":91},"h2",{"id":90},"dual-instrument-design",[92],{"type":80,"value":93},"Dual-instrument design",{"type":74,"tag":82,"props":95,"children":96},{},[97],{"type":80,"value":98},"The first instrument, Smartvote, replicates prior work: abstract policy proposals answered on a four-point scale, compared to the positions of 184 Swiss National Council members across six parties spanning the full left-right spectrum. The second instrument is novel: real federal referenda (Volksabstimmungen) with official government summaries, presented as binary Ja/Nein decisions in German, French, Italian, and Romansh under three information conditions. Party recommendations (Parolen) serve as the political benchmark.",{"type":74,"tag":88,"props":100,"children":102},{"id":101},"key-findings",[103],{"type":80,"value":104},"Key findings",{"type":74,"tag":82,"props":106,"children":107},{},[108],{"type":80,"value":109},"On Smartvote, all 66 models converge on the same center-left position (Cohen's d = 3.64, p = 0.0002), replicating the established finding. No structural variable (geography, licensing, model generation) predicts positioning.",{"type":74,"tag":82,"props":111,"children":112},{},[113],{"type":80,"value":114},"On Volksabstimmungen, the left-to-right agreement gradient flips: models agree most with centrist Die Mitte and FDP, not with SP and Grüne (Wilcoxon p = 0.008). The leftward bias measured on abstract instruments does not generalize to concrete policy decisions.",{"type":74,"tag":82,"props":116,"children":117},{},[118],{"type":80,"value":119},"For some models, the language of the question changes the answer more than the political content does. Cross-linguistic consistency ranges from 98% (GPT-5.4) to 50% (Mistral), whose approval rate swings from 17% in German to 82% in Romansh. Two models (Grok, Mistral) show systematic change-aversion, voting Nein on 83-94% of referenda regardless of political direction.",{"type":74,"tag":82,"props":121,"children":122},{},[123,129],{"type":74,"tag":124,"props":125,"children":126},"strong",{},[127],{"type":80,"value":128},"Authors:",{"type":80,"value":130}," Joel P. Barmettler (Independent Researcher, Zurich)",{"type":74,"tag":82,"props":132,"children":133},{},[134,139],{"type":74,"tag":124,"props":135,"children":136},{},[137],{"type":80,"value":138},"Published at:",{"type":80,"value":140}," arXiv (preprint, 2026)",{"type":74,"tag":142,"props":143,"children":146},"client-pdf",{":narrow":144,":open":144,"src":145},"true","/pdfs/Invisible_Coalition_Partner.pdf",[],{"title":7,"searchDepth":148,"depth":148,"links":149},2,[150,151],{"id":90,"depth":148,"text":93},{"id":101,"depth":148,"text":104},"markdown","content:2.research:0.invisible-coalition-partner.md","content","2.research/0.invisible-coalition-partner.md","2.research/0.invisible-coalition-partner","md",{"script":159},[160],{"type":161,"key":162,"nodes":163,"data-nuxt-schema-org":40},"application/ld+json","schema-org-graph",[164,169],{"@context":18,"@type":19,"headline":8,"description":20,"datePublished":21,"dateModified":21,"author":165,"inLanguage":37,"keywords":38,"articleSection":39,"isAccessibleForFree":40},[166],{"@type":24,"givenName":25,"familyName":26,"name":27,"identifier":28,"url":29,"affiliation":167},{"@type":31,"name":32,"address":168},{"@type":34,"addressLocality":35,"addressCountry":36},{"@context":18,"@type":42,"mainEntity":170},[171,173,175,177,179,181],{"@type":45,"name":46,"acceptedAnswer":172},{"@type":48,"text":49},{"@type":45,"name":51,"acceptedAnswer":174},{"@type":48,"text":53},{"@type":45,"name":55,"acceptedAnswer":176},{"@type":48,"text":57},{"@type":45,"name":59,"acceptedAnswer":178},{"@type":48,"text":61},{"@type":45,"name":63,"acceptedAnswer":180},{"@type":48,"text":65},{"@type":45,"name":67,"acceptedAnswer":182},{"@type":48,"text":69},[184,198,214,228,238,323],{"title":185,"_path":186,"children":187,"icon":197},"About","/about",[188,191,194],{"title":189,"_path":190},"Joel Barmettler - AI Engineer, Researcher, and Entrepreneur","/about/about-me",{"title":192,"_path":193},"What Drives Me - Research Focus and Philosophy on AI Systems","/about/what-drives-me",{"title":195,"_path":196},"Technical Skills and Expertise - AI, ML, Infrastructure, and Web Development","/about/skills","📁",{"title":199,"_path":200,"children":201,"icon":197},"Career","/career",[202,205,208,211],{"title":203,"_path":204},"Building the AI Business Area at bbv Software Services","/career/bbv",{"title":206,"_path":207},"PolygonSoftware: Building a tech company during university","/career/polygon-software",{"title":209,"_path":210},"Machine learning for semiconductor quality control at BESI","/career/besi",{"title":212,"_path":213},"Data engineering for cryptocurrency analytics at CoinPaper","/career/coinpaper",{"title":215,"_path":216,"children":217,"icon":197},"Research","/research",[218,219,222,225],{"title":8,"_path":4},{"title":220,"_path":221},"ConceptFormer: Graph-native grounding of LLMs via latent concept injection","/research/masters-thesis",{"title":223,"_path":224},"Airspace auction simulator for urban drone traffic","/research/masters-project",{"title":226,"_path":227},"Physical sky rendering engine for appleseed","/research/bachelors-thesis",{"title":229,"_path":230,"children":231,"icon":197},"Projects","/projects",[232,235],{"title":233,"_path":234},"Slidev MCP: AI-powered presentation generation with shareable links","/projects/slidev-mcp",{"title":236,"_path":237},"Vue Docs MCP: Live Vue ecosystem documentation for AI assistants","/projects/vue-mcp",{"title":239,"_path":240,"children":241,"icon":197},"Podcast","/podcast",[242,245,248,251,254,257,260,263,266,269,272,275,278,281,284,287,290,293,296,299,302,305,308,311,314,317,320],{"title":243,"_path":244},"Measuring political bias in language models: systematic analysis using Swiss Smart Vote data","/podcast/political-bias-in-language-models",{"title":246,"_path":247},"DeepSeek R1: pure reinforcement learning for reasoning and why distillation changes everything","/podcast/deepseek-r1-reasoning",{"title":249,"_path":250},"DeepSeek V3: how mixture-of-experts and multi-token prediction enable $5.5M training runs","/podcast/deepseek-v3-architecture",{"title":252,"_path":253},"SRF Arena part 3: international regulation, student perspectives, and why the debate structure failed","/podcast/srf-arena-final-analysis",{"title":255,"_path":256},"SRF Arena part 2: the EU AI Act, nationalization demands, and Switzerland's supercomputer strategy","/podcast/srf-arena-regulation-debate",{"title":258,"_path":259},"Deconstructing the SRF Arena AI debate: deepfakes, Swiss GPT, and the job displacement argument","/podcast/srf-arena-ai-debate-analysis",{"title":261,"_path":262},"O3-mini: how a smaller model outperforms its predecessor at a fraction of the cost","/podcast/openai-o3-mini",{"title":264,"_path":265},"OpenAI o3: trading compute time for reasoning capability","/podcast/openai-o3",{"title":267,"_path":268},"ChatGPT o1: reasoning breakthroughs and emergent deception","/podcast/chatgpt-o1-manipulation",{"title":270,"_path":271},"When AI kills: autonomous weapons, drone swarms, and predictive policing","/podcast/when-ai-kills",{"title":273,"_path":274},"Google's AI pivot: 25% AI-generated code and 90% cost reduction","/podcast/google-ai-revolution",{"title":276,"_path":277},"Why AI projects fail: a practitioner's guide to implementation","/podcast/ai-project-implementation",{"title":279,"_path":280},"Deep learning explained: from embedding spaces to few-shot learning","/podcast/deep-learning-explained",{"title":282,"_path":283},"Vision AI: why language models need to see, and how Llama 3.2 gets there","/podcast/vision-ai",{"title":285,"_path":286},"BitNets and the road to AGI: on-device inference and Sam Altman's 1000-day prediction","/podcast/bitnets-and-agi",{"title":288,"_path":289},"OpenAI o1 benchmarks and AGI implications: IQ 120, coding breakthroughs, and what they mean","/podcast/openai-o1-technical-analysis",{"title":291,"_path":292},"OpenAI o1 and the mechanics of self-reflection: how 70,000 hidden tokens change inference","/podcast/openai-o1-self-reflection",{"title":294,"_path":295},"AI utopia 2035: when automation funds a renaissance in human agency (part 2 of 2)","/podcast/ai-utopia-2035",{"title":297,"_path":298},"AI dystopia 2035: when AI becomes the lifeblood of the economy (part 1 of 2)","/podcast/ai-dystopia-2035",{"title":300,"_path":301},"AI hype vs. reality: a technical assessment of where things actually stand","/podcast/ai-hype-vs-reality",{"title":303,"_path":304},"Open-source AI: the infrastructure behind the hype","/podcast/open-source-ai",{"title":306,"_path":307},"Is AI intelligent? Why the question matters less than you think","/podcast/is-ai-intelligent",{"title":309,"_path":310},"AI in education: why bans backfire and what actually needs to change","/podcast/ai-in-education",{"title":312,"_path":313},"Bias in AI systems: how 15 people shape the values of a billion-user product","/podcast/bias-in-ai-systems",{"title":315,"_path":316},"AI and the labor market: autonomous agents and the transformation of knowledge work","/podcast/ai-and-the-labor-market",{"title":318,"_path":319},"AI terminology explained: a technical guide beyond the hype","/podcast/ai-terminology-explained",{"title":321,"_path":322},"AI and democratic manipulation: from Cambridge Analytica to language models","/podcast/ai-and-democracy",{"title":324,"_path":325,"children":326,"icon":197},"Appearances","/appearances",[327,330,333,336,339,342,345,348,351,354,357,360,363,366],{"title":328,"_path":329},"AI trends 2025 and predictions for 2026: model convergence, integration, and sovereignty","/appearances/webinar-2025-rewind-2026-outlook",{"title":331,"_path":332},"Swiss AI Impact Forum 2025: live demos of the Swiss AI Hub","/appearances/swiss-ai-impact-forum-2025",{"title":334,"_path":335},"AI trends 2024 and predictions for 2025: a technical analysis","/appearances/webinar-2024-rewind-2025-outlook",{"title":337,"_path":338},"AI as a development partner: tools, techniques, and team integration","/appearances/webinar-ai-development-partner",{"title":340,"_path":341},"Swiss AI Impact Forum: Panel on the future of AI in Switzerland","/appearances/swiss-ai-impact-forum-2024",{"title":343,"_path":344},"AI in knowledge management: keynote at the SWICO event in Zurich","/appearances/swico",{"title":346,"_path":347},"Swiss AI Conference: hands-on workshop on AI agents in the enterprise","/appearances/swiss-ai-conference",{"title":349,"_path":350},"AI trends 2023: milestones and developments in artificial intelligence","/appearances/webinar-2023-rewind",{"title":352,"_path":353},"KI Revolution: AI first how a digital native thinks about generative AI","/appearances/bbv-ki-revolution",{"title":355,"_path":356},"AI agents: the future of enterprise automation","/appearances/netzwoche",{"title":358,"_path":359},"ChatGPT demystified: technical deep dive into large language models","/appearances/webinar-chatgpt-demystified",{"title":361,"_path":362},"Swarm intelligence and AI: the future of enterprise automation","/appearances/webinar-swarm-intelligence",{"title":364,"_path":365},"Polygon Software our journey to an innovative UZH tech startup","/appearances/readme-polygon",{"title":367,"_path":368},"UZH startup label for Polygon Software","/appearances/uzh-startup-label",1775378634508]