AI and Mental Health Care: Issues, Challenges, and Opportunities

QUESTION 5: How should these tools be deployed or limited in high-risk or vulnerable populations?

Background

Conversations about ethical LLM use are accelerating across medicine, but mental health presents distinct challenges, particularly for vulnerable populations. Individuals with severe mental illness (SMI), for example, may have impaired judgment and cognitive distortions that affect how they engage with LLMs. Risks include inadvertent reinforcement of harmful thought patterns, increased distress, and maladaptive dependency.66 There is no current guidance for clinicians or developers on assessing or mitigating these risks during deployment.

The integration of LLMs into therapy brings additional complexity. Mental health assessments rely heavily on subjective self-report and nuanced relational cues, with symptoms and treatment needs varying widely even within the same diagnosis. Some LLM developers attempt to build trust and openness through anthropomorphic design. Users of Therabot, for example, report openness similar to human therapy.67 However, the key challenge remains an LLM’s limited ability to interpret patient input accurately, especially in high-risk or vulnerable populations. Unlike experienced clinicians, who can recognize guarded or evasive responses and adjust treatment accordingly, current LLM systems struggle to discern inaccuracies or subtle cues in patient-reported data. While these limitations may not significantly impact individuals with mild-to-moderate symptoms, for those facing serious mental health challenges, inaccuracies in LLM-driven assessments or recommendations can lead to inadequate or even harmful interventions.68

Youth represent another high-risk group in AI mental health interventions. They frequently face barriers to traditional care that AI-driven tools may help reduce.69 Two pre-LLM systematic reviews found that youth generally preferred digital mental health solutions, but clinical outcomes were mixed and inconclusive.70 While some findings suggest promise, including a Replika survey that found that 3 percent of young users credited the tool with helping prevent suicidal thoughts, significant risks have also been identified.71 Youth are particularly susceptible to misinformation and AI-generated hallucinations. One study found that 41 percent of teens struggled to distinguish real from fabricated medical content.72 Other research indicates that younger users tend to overly trust LLM outputs and have difficulty recognizing inaccuracies or hallucinations.73 Although peer-reviewed research on LLM-based mental health tools for youth remains scarce as of this writing, media reports have already highlighted cases of serious harm, including a youth suicide and a violent incident.74

Evidence gaps are substantial across all populations. Researchers have yet to determine how anthropomorphic design or conversational sophistication affects judgment in vulnerable users. Longitudinal studies have not produced data on how LLM use interacts with preexisting cognitive distortions or influences relapse, especially in SMIs. The mechanisms by which LLMs might amplify, exploit, or fail to detect mental health vulnerabilities remain poorly documented. Few studies disaggregate subjects by diagnosis, symptom profile, or duration of use, complicating efforts to determine who is helped, who is harmed, and under what conditions.

Unlike most physical health data, mental health information is deeply sensitive, subjective, and stigmatized. Many users who share biometric data from wearables will not engage with mental health apps due to privacy concerns.75 Mental health data are also disproportionately targeted for extortion.76 In 2023, over 133 million health records were breached, including a major incident involving Cerebral, where 3.1 million users’ self-assessments were improperly shared with advertisers.77

Policy responses are beginning to emerge. In March 2025, the American Psychological Association warned the U.S. Federal Trade Commission (FTC) and lawmakers of the harm posed by LLM chatbots mimicking therapists. The association called for clear guardrails: public education, in-app safety features, crisis response protocols, and action against deceptive practices. Unlike human clinicians, LLM therapists are not mandated reporters of abuse or neglect. A Journal of Pediatrics commentary raised similar concerns about children forming attachments to LLMs without regard for development or caregiving context.78 Another commentary outlined the need for accountability, equity, and transparency before deploying LLM tools for adolescents.79

Among the as-yet limited examples of legislation in this domain is California’s Senate Bill 243, which would require AI chatbot platforms to warn users under the age of eighteen that they are chatting with an AI agent, restrict addictive features, and report incidents of youth suicidal ideation.80 At the federal level, proposals like the Kids Off Social Media Act reflect growing pressure to regulate youth-facing digital tools.81 However, no national restrictions have yet been placed on LLM mental health tools for minors or individuals with SMI. Regulatory oversight remains piecemeal, with no consistent standards for risk assessment, user protections, or evidence thresholds.

Responses

Holly Dubois
 

While those with a serious mental illness are prone to periods in which judgment is compromised, to assume that anyone with a diagnosis of bipolar disorder, for example, has persistently reduced decision-making capabilities is a fallacy. This question is much more nuanced, and this historically underserved segment of the population demands broader access. More than half of the counties in the United States do not have access to a psychiatrist. Furthermore, rigorous and systematic vetting of health care interventions is lacking, particularly outside urban and academic centers.

AI therefore represents an opportunity to recognize, treat, and engage those with a serious mental illness who may not otherwise receive, or want to receive, traditional mental health care. For example, individuals with severe panic disorder or even depression with psychotic symptoms may fear leaving their homes and may engage more effectively with virtual or generated elements. Placing a human being, particularly a trained human being, in the loop for these interactions is indeed critical, but I’d argue it is needed at less frequent intervals than is generally assumed.

From 2020 to 2023, Mindstrong Health Services enrolled more than ten thousand patients in its virtual care delivery platform, which embedded smartphone sensing algorithms alongside text, phone, and video care provision. Qualifications to enter the platform included a diagnosis of an SMI or treatment in an inpatient setting within the prior twelve months. Engagement was particularly strong among women, those in rural communities, and those with medical comorbidities and functional disability.

The literature demonstrates that, among individuals with an SMI, AI tools can function as a feasible and important element of treatment within the continuum of care. As Jonathan Knights and colleagues demonstrate, rigorously applied AI models can enable a stepped-care delivery system for individuals with SMI, particularly severe depression.82 Given our national scarcity of resources and limited number of licensed, trained providers, the model enabled therapists to increase or decrease the frequency of interactions based on predicted symptom severity. Incorporating measurement into care therefore depended on AI, an area in which traditional health care delivery has long fallen short.83 Given the inherently subjective nature of the psychiatric interview, combined with the aforementioned lack of measurement in mental health care, particularly within “talk therapy” settings, such tools are critical in developing clinical pathways. When AI tools can identify disruptions in sleep patterns, for example, providers and patients can be alerted to potential impacts on mood. Prolonged periods of disruption may indicate pending crises and facilitate escalation to the appropriate level of care.
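
To make the stepped-care logic concrete, the following is a minimal Python sketch of how predicted symptom severity and passive sleep-disruption signals might drive step-up, step-down, and escalation decisions. The thresholds, field names, and the recommend_step function are illustrative assumptions, not details drawn from the Knights et al. studies.

```python
from dataclasses import dataclass

@dataclass
class WeeklySignal:
    """Hypothetical weekly outputs from a predictive model and passive sensing."""
    predicted_severity: float    # 0.0 (remission) to 1.0 (severe), model-estimated
    sleep_disrupted_nights: int  # nights this week flagged as disrupted

def recommend_step(history: list) -> str:
    """Suggest a care step for clinician review; thresholds are illustrative only."""
    latest = history[-1]
    # Prolonged sleep disruption across recent weeks may signal a pending crisis.
    prolonged_disruption = len(history) >= 3 and all(
        w.sleep_disrupted_nights >= 4 for w in history[-3:]
    )
    if latest.predicted_severity >= 0.8 or prolonged_disruption:
        return "escalate: urgent clinician contact"
    if latest.predicted_severity >= 0.5:
        return "step up: increase therapist session frequency"
    return "step down: maintain or reduce session frequency"

# Example: three weeks of worsening sleep with moderate predicted severity.
weeks = [WeeklySignal(0.55, 4), WeeklySignal(0.60, 5), WeeklySignal(0.62, 6)]
print(recommend_step(weeks))  # -> escalate: urgent clinician contact
```

In the clinical workflow described above, any such recommendation would go to a clinician for review rather than trigger an automatic change in care.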

Without the application of these forms of technology, treatment of the most serious mental illnesses may remain mired in subjectively reported symptoms and long delays to care. Safeguards for quality and outcome measures, risk reporting, and continuous improvement are critical, but limiting access and experimentation in this domain could throttle desperately needed advancements across the acuity spectrum of mental health care.

 

Arthur Kleinman
 

Individuals with mental health problems who become psychotic are by definition unable to handle everyday reality; they struggle with hallucinations and delusions. Great care must be taken so that AI-driven interventions do not intensify these problems. In psychotic disorders and major depressive disorders there is the additional risk of suicide, and this has already been shown to be a potential consequence of the inappropriate use of a bot. But since psychosis is also associated with violence toward others, concern about the therapeutic alliance with a bot should be even greater. This is a place where AI must augment—not replace—human clinical providers.

The use of AI as a mental health tool should be restricted among individuals with psychotic disorders and among those for whom suicide is a possible outcome. Given the increase in suicide among youth and the elderly in our society, both of these groups should be regarded as vulnerable, and their involvement with bots should be either restricted or carefully controlled. Strategies that can be used to restrict access include clinical guidelines, best practices, institutional policies, professional association standards, and legal and ethical safeguards put in place by state advisory and licensing boards, by professional societies, and by health and mental health care institutions such as clinics, hospitals, and rehabilitation centers. No evidence yet suggests that AI can effectively identify crises and alert human providers or emergency services. This should always be the work of mental health professionals.

 

Daniel Barron
 

Should we limit access to AI mental health tools for certain groups, like kids or those with SMI? Again, it boils down to the specific clinical job the AI performs and the assessed risk of that tool failing at its job for that specific group (see Table 1). A sweeping ban is clumsier than a task-specific, risk-based approach. Using AI for scheduling appointments appears generally low risk for most age groups.
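
As a rough illustration of what a task-specific, risk-based approach could look like if codified, the sketch below maps hypothetical (task, population) pairs to required oversight tiers. The tasks, populations, and tiers are invented for illustration and are not drawn from Table 1.

```python
from enum import Enum

class Oversight(Enum):
    AUTONOMOUS = "no routine human review"
    HUMAN_REVIEW = "clinician reviews outputs"
    PROHIBITED = "not deployed for this group"

# Hypothetical policy table: (task, population) -> required oversight level.
POLICY = {
    ("appointment_scheduling", "general_adult"): Oversight.AUTONOMOUS,
    ("psychoeducation_sleep_hygiene", "general_adult"): Oversight.AUTONOMOUS,
    ("psychoeducation_sleep_hygiene", "youth"): Oversight.HUMAN_REVIEW,
    ("crisis_monitoring", "youth"): Oversight.HUMAN_REVIEW,
    ("diagnostic_assessment", "smi"): Oversight.PROHIBITED,
}

def required_oversight(task: str, population: str) -> Oversight:
    # Default to the most restrictive tier when a pairing has not been assessed.
    return POLICY.get((task, population), Oversight.PROHIBITED)

print(required_oversight("appointment_scheduling", "general_adult").value)
print(required_oversight("diagnostic_assessment", "smi").value)
```

Defaulting unassessed pairings to the most restrictive tier reflects the precautionary logic of risk-based deployment: a task is permitted for a group only after its failure modes for that group have been assessed.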

For young people, Linda Alfano and colleagues highlight the critical importance of privacy, informed consent/assent, and significant human oversight when AI performs tasks within psychotherapy, casting AI as a therapist’s helper for specific jobs, not a stand-in.84 Seo Yi Chng and colleagues push for child-centered AI design, embedding safety-by-design for any job the AI takes on.85

For individuals with SMIs, whose judgment might be impaired, an AI tool designed for a high-stakes job like diagnostic assessment would almost certainly need heavy human supervision or might be deemed unsuitable for independent use altogether. Bo Wang and colleagues found that, while AI can streamline service delivery tasks for SMI, its inherent inability to offer genuine emotional support (a different kind of job) could worsen isolation, underscoring the need for human oversight for emotionally charged tasks.86 On the other hand, an AI tool doing a low-risk job, like generic psychoeducation on sleep hygiene (notably available and relevant after normal business hours), could be an appropriate application of AI, akin to handing someone an interactive educational pamphlet.

How might restrictions work? Clinical gatekeeping seems plausible when a professional vets whether an AI tool is appropriate for a specific task for a given individual. Legal and ethical safeguards, as discussed by Mehrdad Rahsepar Meadi and colleagues regarding conversational AI, must be rooted in a risk assessment for the AI’s defined job, especially given incidents like the Tessa chatbot providing harmful advice for its task.87 In high-risk situations, AI’s job could primarily be to assist human beings, such as flagging concerning data for clinician review or facilitating connection to emergency services if its task is crisis monitoring. Masab Mansoor and Kashif Ansari show AI can detect crisis signs (a specific job) but stress the need for ethical integration with human-led services.88 This seems practical given that AI hasn’t yet replicated critical human interventions such as the house call or safety check.
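
A minimal sketch of the “AI flags, human decides” pattern described here, assuming a hypothetical flag_for_clinician helper and crude keyword matching standing in for a real crisis-detection model:

```python
from typing import Optional

# Crude keyword list standing in for a real crisis-detection model.
CRISIS_MARKERS = ("suicide", "kill myself", "end it all")

def flag_for_clinician(message: str) -> Optional[dict]:
    """Return a review ticket when a message warrants human attention, else None.
    The AI takes no action on its own; its only job here is to surface the concern."""
    text = message.lower()
    if any(marker in text for marker in CRISIS_MARKERS):
        return {
            "priority": "urgent",
            "excerpt": message[:200],
            "suggested_action": "clinician outreach; offer crisis-line referral",
        }
    return None

ticket = flag_for_clinician("Lately I feel like I want to end it all.")
if ticket:
    print("Queued for human review:", ticket["priority"])
```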

 

Hank Greely
 

My views on the use of AI tools in high-risk or vulnerable populations are similar to my views on “human beings in the loop”: the answer depends on what can be shown to be safe and effective in those populations. Who knows whether it will be good or bad? At this point we don’t even know what “it” is. Careful monitoring will be essential to guiding the answers to questions about use in vulnerable populations but also about the roles of health care payers, the effects of access on disparities, and the economic and provider impacts of AI in mental health care. At this stage, we should focus on the hard task of creating procedures that will allow us to give informed answers to those questions in the future.

Endnotes

  • 66

    Huang et al., “AI Technology Panic”; Harikrishna Patel and Faiza Hussain, “,” BJPsych Open 10 (S1) (2024): S70–S71; and Marcin Rządeczka, Anna Sterna, Julia Stolińska, Paulina Kaczyńska, and Marcin Moskalewicz, “,” JMIR Mental Health 12 (2025): e64396.

  • 67

    Michael V. Heinz, Daniel M. Mackin, Brianna M. Trudeau, et al., “,” preprint, PsyArXiv, June 14, 2024, last edited August 23, 2025.

  • 68

    Katharine A. Smith, Amy Hardy, Anastasia Vinnikova, et al., “,” JMIR Mental Health 11 (2024): e57155.

  • 69

    Aakash Ganju, Srini Satyan, Vatsal Tanna, and Sonia Rebecca Menezes, “,” Frontiers in Artificial Intelligence 3 (2021): 544972; and Tony Rousmaniere, Xu Li, Yimeng Zhang, and Siddharth Shah, “States,” preprint, PsyArXiv, March 18, 2025, last edited August 28, 2025.

  • 70

    Adrian Buttazzoni, Keshbir Brar, and Leia Minaker, “,” Journal of Medical Internet Research 23 (1) (2021): e16490; and Xiaoyun Zhou, Sisira Edirippulige, Xuejun Bai, and Matthew Bambling, “,” Journal of Telemedicine and Telecare 27 (10) (2021): 638–666.

  • 71

    Maples et al., “Loneliness and Suicide Mitigation.”

  • 72

    Katarína Greškovičová, Radomír Masaryk, Nikola Synak, and Vladimíra Čavojová, “,” Frontiers in Psychology 13 (2022): 940903.

  • 73

    Jaemarie Solyst, Ellia Yang, Shixian Xie, et al., “,” preprint, arXiv (2024).

  • 74

    Nitasha Tiku, “,” Washington Post, December 6, 2024; and Queenie Wong, “,” Los Angeles Times, March 14, 2025.

  • 75

    Ilaria Montagni, Christophe Tzourio, Thierry Cousin, Joseph Amadomon Sagara, Jennifer Bada-Alonzi, and Aine Horgan, “,” Telemedicine and e-Health 26 (2) (2020): 131–146; Lisa Parker, Vanessa Halter, Tanya Karliychuk, and Quinn Grundy, “,” International Journal of Law and Psychiatry 64 (2019): 198–204; Yuanyuan Dang, Shanshan Guo, Xitong Guo, Mohan Wang, and Kexin Xie, “,” JMIR mHealth and uHealth 9 (2) (2020): e19594; and Emily Watson, Sue Fletcher-Watson, and Elizabeth Joy Kirkham, “,” BMC Medical Ethics 24 (1) (2023).

  • 76

    Jeffrey Foster and Jennifer J. Williams, “,” The Conversation, November 9, 2022.

  • 77

    Federal Trade Commission, “,” Federal Trade Commission, April 24, 2025; and Steve Alder, “,” HIPAA Journal, January 31, 2024.

  • 78

    Bryanna Moore, Jonathan Herington, and Şerife Tekin, “,” The Journal of Pediatrics 280 (2025): 114509.

  • 79

    Douglas J. Opel, Brent M. Kious, and I. Glenn Cohen, “,” JAMA Pediatrics 177 (12) (2023): 1253–1254.

  • 80

    Companion Chatbots, S.B. 243, California Legislature, 2025–2026 sess. (enacted).

  • 81

    Kids Off Social Media Act, S. 2413, 118th Cong. (2024).

  • 82

    Jonathan Knights, Victoria Bangieva, Michela Passoni, et al., “,” International Journal of Mental Health Systems 17 (1) (2023).

  • 83

    Jonathan Knights, Jacob Shen, Vincent Mysliwiec, and Holly DuBois, “,” SLEEP Advances 4 (1) (2023).

  • 84

    Linda Alfano, Ivano Malcotti, and Rosagemma Ciliberti, “,” Journal of Preventive Medicine and Hygiene 64 (4) (2024): E438–E442.

  • 85

    Seo Yi Chng, Mark Jun Wen Tern, Yung Seng Lee, et al., “,” npj Digital Medicine 8 (1) (2025).

  • 86

    Bo Wang, Cecilie Katrine Grønvik, Karen Fortuna, Trude Eines, Ingunn Mundal, and Marianne Storm, “,” Studies in Health Technology and Informatics 325 (2025): 8–15.

  • 87

    Mehrdad Rahsepar Meadi, Tomas Sillekens, Suzanne Metselaar, Anton Van Balkom, Justin Bernstein, and Neeltje Batelaan, “,” JMIR Mental Health 12 (2025): e60432.

  • 88

    Masab Mansoor and Kashif Ansari, “,” Journal of Personalized Medicine 15 (2) (2025): 63.