Shiwali Mohan | शिवाली मोहन boosted:

Our recent work at #SRIInternational studies if medical #GenerativeAI #GenAI systems can support patients' information needs.

Paper: https://arxiv.org/abs/2402.00234

The short answer, unsurprisingly, is no.

Turns out #AI #ML science lacks methods to evaluate the performance and usefulness of #GenAI (and other methods) in real-world, human contexts. The current methods - accuracy on benchmark tasks - fall short of measuring effectiveness.

This is a critical gap that needs serious thought. (1/n)

The paper lays out an approach to evaluate #GenerativeAI's usefulness in addressing human needs. It applies a drastically different lens on #AI and #ML - instead of #AI_Is_Automation (that replaces humans), we take the view that #AI_Is_Augmentation (that supports humans in our tasks).

Paper: https://arxiv.org/abs/2402.00234

#responsibleai #humancentereddesign #humanfactors #humancomputerinteraction #HumanAwareAI #PatientEducation #patientcenteredcare (5/n)

And, the results are in:
1. High error rates (>35%) observed in both systems. This is particularly concerning because patients cannot tell when the system is providing misinformation.
2. Information generated by these systems was characteristically different from what a doctor provides, implying that they don't address the patients' needs. (4/n)

The paper describes an interesting methodology. We study if #GenerativeAI can support the information needs that patients have when they are navigating the healthcare system, particularly when they are trying to understand their scans and reports to educate themselves. We studied patient-provider interactions to identify 10 types of information needs patients have. From these, we generated evaluation datasets to measure how well #ChatGPT and #MedFlamingo address those needs. (3/n)

#AI, #ML science has a lot to learn from #HCI, #Psychology, #economics on evaluation methods.

Datasets are often curated with implicit assumptions about what the use case is. E.g., #GenAI systems (#MedFlamingo, #MedPalm) are evaluated with QA from #USMLE. The questions in these datasets are designed to test a clinician's knowledge and memory. A model's performance on these datasets tells us NOTHING about whether it is a good information source for lay persons. (2/n)

@bwaber It is infinitely sad that our #technocratic culture has created this delusion that the solution to all problems lies in #technology and more recently in #AI. This delusion drives policy in several countries which is detrimental to the population.

Meanwhile, #AI and #ML have invested very little in evaluating if these methods solve problems that exist in developing-world contexts but constantly claim that they have fixes.

Shiwali Mohan | शिवाली मोहन boosted:

My home city of #Seattle recently passed an ordinance that bans caste discrimination. The Indian diaspora in Seattle is split.

https://www.cnn.com/2023/02/22/us/seattle-bans-caste-discrimination-cec/index.html

#Caste #India #Hindu #Immigrants #Bahujan #Dalit

My friendly #Indians - note that the entrance cutoff for post graduate #pg #medical #training is now 0!

Private educational institutes that charge exorbitant tuition now can accept and 'train' candidates who left their exam blank but can afford tuition. #reservation for the #rich.

Looking forward to outrage from the #merit crowd who are offended because #Dalit/#Bahujan folk don't have to score as much as the #general folk to access education.

https://timesofindia.indiatimes.com/education/news/neet-pg-counselling-2023-mohfw-sets-cut-off-percentile-to-zero-across-all-categories/articleshow/103815816.cms

#India #merit

3. Recent #GenerativeAI is one of the most expensive forms of computation to exist. Even for particularly simple problems, it takes significantly more energy.

Without understanding the constraints that exist in medical systems at #LMICs, can we really claim that this technology would be revolutionary?

For #BMGF to support these claims without sufficient critical analysis is counterproductive to both #AI and #HealthCare in #LMICs.

(n/n)

These wild claims highlight several problems with the #AI #ML tech and research ecosystem.

1. #AI #ML experts make extremely inflated claims without attempting to understand the context of the problem. It is a failure of our field.

2. #AI as a term invites everyone to speculate without understanding the computational methods that underlie these systems. Computations have constraints, algorithms have specific shortcomings.

(4/n)

This means that to deploy #AIEnabledUltrasound you need to have experienced technicians who can use the ultrasound machine to take informative images.

An experienced technician is an expert in anatomy, the technology, and diseases. If they are taking the most informative images, they might as well tell what is going on and what is wrong. #AI #ML can only do the last part, and that too, to an uncertain degree.

(3/n)

All #AI for radiology work I know about takes as input an image and generates a prediction of the disease. All this work assumes access to informative image samples over which prediction can be made.

The ultrasound is done by a technician. Most scans are just noise. The technician has to find the right position from which an informative image can be taken. We don't have #AI #ML methods that can do this.

So, what does this mean? 2/n

I am going through the recent #GoalKeepers 2023 report released by #billandmelindagatesfoundation

I am extremely puzzled by the claim about #AIEnabledUltrasound that is expected to revolutionize maternal healthcare in low and middle income countries #LMIC.

I have been poring over #AI #ML research for the past few months on radiology images and it is definitely NOT up to the challenge.

A thread below.

https://www.gatesfoundation.org/goalkeepers/report/2023-report/#Intro

#AI #ML #Medicine 1/n

If Twitter is now X, does that mean that we can have Twitter back as a platform?

#technology #musings

@UlrichJunker

An additional thing to watch out for: in decisions by a committee, a candidate from a non-dominant background/identity will face more resistance in comparison with dominant identities.

For any candidate, there are reasons to give them the job and there are reasons not to. For non-dominant folks, the committee starts gravitating towards reasons not to hire. And, the decision seems rational/safe.

You have to be open to arguing with your peers - a hard task!

@UlrichJunker

Clearly articulating what skillset a job needs and then having some objective evaluation would help a lot.

For several roles, the criteria are very abstract and subjective, like 'good fit' or 'potential'. These are very, very hard to prove as a candidate.

When I meet an Indian woman candidate, I feel an immediate connection & appreciation for her struggles. I'm sure white men feel similarly when they meet candidates that share the same identity. Self-awareness is helpful too.

@UlrichJunker
Organizations have metrics to define success. E.g., for a researcher - $ raised, papers written, invited talks given, students mentored etc.

For a member of non-dominant groups, often, these measures of success are out of the window. The discussion very quickly turns into potential, visibility, fit, etc. - all subjective measures.

In my experience, the bias is not to keep people behind. It exists to bring a certain type of people ahead based on pure subjective evaluations.

#DEI discussions in #academe, #HigherEd, #science focus significantly on '#merit'. Conservatives argue for prioritizing 'objective merit' while the liberals argue for expanding the definition of 'merit'.

Both devalue an individual's capacity.

If 'merit' was cleanly defined & consistently enforced, our communities would be more diverse already. People from non-dominant communities are NOT lacking in merit - _the main_ issue is subjective judgments that override objective measurements.

@cwebber Taking 'AI' to court awards it an undeserved privileged status among software, adding to the hype.

#AI IS software. It is not written by software engineers but inferred from input-output pairs curated by a community of human engineers and trainers who label or generate signals etc.

It should be held to the same scrutiny, correctness etc. as all other built systems.

Don't take 'AI' to court - take the companies to court who are heavily incentivized to inflate what can be done.
