Four ways I would love to see Generative AI mature in 2024

Authored by: Ashirul Amin PhD
January 24, 2024 - 6 min read

Arguably, 2023 was a year that showcased two of the most creative phenomena in recent times: Taylor Swift and Generative AI. While Taylor Swift continued to captivate the world with her evocative lyrics and melodies, reaching new heights in her already stellar career, Generative AI broke new ground in the tech world, astonishing many with its ability to generate prose, art, and even music, blurring the lines between human and machine creativity. 

At BFA Global, we have been following developments in generative AI (GenAI) with great interest, given its potential to democratize content creation and advanced computation, and operate as competent interactive agents at the last mile. We are eager to see how we can add it to our existing arsenal of AI-related tools. We set up our data science practice a decade ago when Python was still a snake in the common vernacular, and have data-mined through terabytes of data since. We created corpora where no natural language processing (NLP) resources existed. And we regularly support startups in determining how AI can supercharge what they do. GenAI could be a powerful complement to these capabilities and further our ability to drive inclusive innovation. 

Interactive agents are a particularly exciting prospect, as GenAI would allow intelligent anthropomorphized bots to interact with individuals who may have low financial literacy or are not that tech-savvy, at a scale, consistency and regularity that is often not possible for human agents at the frontiers of financial inclusion. My excitement is, however, tempered with caution. Having been at the front lines of advising and implementing technical solutions for vulnerable communities for close to two decades, I am acutely sensitive to making sure that no harm comes from the premature adoption of cutting-edge technology by those very communities. We have been putting GenAI models through their paces over the last six months. 

I list three things I would love to see in 2024 before I can feel comfortable enough to deploy GenAI as interactive agents at the last mile, as well as a fourth “nice to have:”

1. Become more multicultural

2. Dial down the flights of fancy

3. Offer a stable development stack

4. Go from augmenting creativity to leading creativity

I look forward to working on these items in 2024, along with our partners, particularly fintechs and other inclusive tech startups who are also innovating at the last mile using GenAI. 

Become more multicultural (vs multilingual) 

GenAI may be multilingual, but it is still fairly monocultural, having been trained mainly on English-language sources (see NYT v. Microsoft, OpenAI, Inc., et al., 1:23-cv-11195, USDC/SDNY (Dec. 2023), p. 26, for an overview of OpenAI’s training dataset). Representation gaps in the training data translate to a lack of exposure to concepts, events and cultural references from various parts of the world, leading to poorer performance when asked about them. Beyond gaps in facts, models trained on predominantly Western-centric data reflect the biases, cultural norms, and values of the regions captured in these sources, which may lead to the misrepresentation of non-Western cultures, and even the perpetuation of problematic stereotypes. Given that many of GenAI’s use cases are conversational, this lack of cultural context is potentially quite problematic. 

I would need to see a concerted effort to expand data sources to be at least reasonably globally inclusive to overcome this key limitation. This requires going beyond the twenty or so “high resource” languages, and making “region-specific or language-specific choices in model building, … involving a diverse set of ‘humans-in-the-loop’ early on, and inviting participation from local communities to bring their voices, dialects, and timing to LLMs.” We look forward to contributing based on our expertise in creating training datasets for “low resource” languages, to train LLMs that are significantly more locally representative.

Dial down the flights of fancy 

GenAI models are prone to confabulation (often referred to as “hallucination,” though the distinction is non-trivial). The degree to which this happens varies – as of November 2023, the incidence rate is about 3% for OpenAI’s models, 5% for Meta’s, and 8% for Anthropic’s. This is more of a feature than a bug, as GenAI models exist to probabilistically generate coherent responses even when those responses are based on limited or incomplete knowledge. Nevertheless, imagine the deleterious effects of a GenAI-powered chatbot providing incorrect information once every twenty responses to a farmer asking about their crop, or to a bank customer trying to resolve an urgent issue, especially when they have no way to determine whether the information provided is reliable.

I consider the rates noted above to be unacceptably high for many of the uses for interactive agents designed for low-income users in emerging markets, and would feel comfortable only when the rate of confabulation has gone down significantly. I am optimistic that models will achieve this threshold soon. After all, confabulation rates were as high as 20% to 30% at the beginning of 2023, and reducing this even further is an area of active research. I am particularly keen to leverage our knowledge repository, as well as those of our peers, for the purposes of retrieval-augmented generation (RAG), which allows one to improve the output of LLMs using additional, context-relevant data without actually having to retrain the underlying model. 
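To make the RAG idea concrete, here is a minimal sketch of the pattern: retrieve the passages from a knowledge repository that are most relevant to a user’s question, then ground the model’s answer in them. The bag-of-words retriever, the sample knowledge snippets, and all function names below are illustrative assumptions for this sketch – a production system would use a learned embedding model and a vector store, not token counts.

```python
import math
import re
from collections import Counter

def embed(text):
    # Toy bag-of-words "embedding": lowercase token counts. A real RAG
    # system would use a learned dense embedding model instead.
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))

def cosine(a, b):
    # Cosine similarity between two sparse token-count vectors.
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query, docs, k=2):
    # Rank repository documents by similarity to the query; keep the top k.
    q = embed(query)
    return sorted(docs, key=lambda d: cosine(q, embed(d)), reverse=True)[:k]

def build_prompt(query, docs, k=2):
    # Stuff the retrieved passages into the prompt and instruct the model
    # to answer only from them -- the grounding step that curbs confabulation.
    context = "\n".join(f"- {d}" for d in retrieve(query, docs, k))
    return (
        "Answer using ONLY the context below. "
        "If the context is insufficient, say you do not know.\n"
        f"Context:\n{context}\n"
        f"Question: {query}"
    )

# Hypothetical last-mile knowledge snippets, for illustration only.
knowledge_base = [
    "Maize in the region is typically planted at the onset of the long rains.",
    "The daily mobile money transfer limit was raised in 2023.",
    "Smallholder crop insurance payouts are triggered by a rainfall index.",
]

print(build_prompt("When should farmers plant maize?", knowledge_base, k=1))
```

The final prompt would then be sent to whichever LLM is in use; because the model is told to answer only from the retrieved context, answers stay anchored to vetted material rather than the model’s free associations.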

Offer a stable development stack

I am sure we have all enjoyed following the frenetic pace of GenAI development over the course of 2023. OpenAI introduced the GPT-3.5 chat agent about a year ago, then added plug-ins in March, APIs in July, and a slew of developer-focused solutions in November. Other models – Llama 2 (Meta), PaLM (Google), Claude (Anthropic), Mistral (Mistral AI), and Grok (xAI) – are rapidly iterating through their own solutions, each with its own mix of idiosyncratic offerings. This is extremely exciting as a tech enthusiast.

This pace of evolution makes life quite difficult as a practitioner because of the risk of rapid obsolescence. Appropriate product design for the last mile takes time and care, and even more time to deploy post-pilot – “move fast and break things” is really not an option in a field that is littered with the carcasses of well-meant projects gone wrong. Iterations of the GenAI stack, by contrast, have happened over much shorter periods. Many GenAI “products” offered by startups earlier in the year were really akin to features that either got functionally subsumed by better products, or were made redundant as the various GenAI stacks got more robust. Personally, I will defer building solutions at scale for the last mile until I have some certainty that the GenAI tech stacks are stable, though I do look forward to continuing to create minimum viable products (MVPs) that probe the capabilities of LLM engines, so that we are prepared for when the stacks are finally stable enough. 

Go from augmenting creativity to leading creativity

For this wishlist item, I ask the reader to allow me the luxury of claiming “you know creativity when you see it,” in lieu of engaging in an epistemological treatment of the topic as I make my case. 

GenAI models offer the knowledge of the world at our fingertips – literally. Some argue that they also offer glimpses of intelligence and reasoning, which makes them more than mere stochastic parrots. We have been able to put them to great use summarizing vast amounts of information, identifying recurring themes, organizing information in new ways, producing detailed and exhaustive templates, and even augmenting human creativity. Emphasis on “augment,” because the outputs are still bound by the best prompt engineering we can conjure ourselves, whether in text, images, or sound. 

Such augmented creativity can be breathtaking. Nevertheless, I have yet to experience the “WOW!” moment one would get if GenAI were to suggest the proverbial flying car instead of a faster horse. For example, we recently put multiple LLMs through their paces to compare job creation programs in Armenia, India, Tajikistan, DR Congo, Georgia and Mozambique, and to offer something appropriate for Uganda. While each instance did a remarkable job summarizing sources, the recommendations were essentially a weighted average of existing programs, without much adaptation to the situation in Uganda. While this is great for job security, it appears that my colleagues and I will still have to design the flying cars, albeit with great help from GenAI. On a related note, my esteemed colleague Maelis Carraro, Managing Partner of the Catalyst Fund, often reminds us that the startups we accelerate need to truly be outliers to succeed. Templatized creativity does not quite meet that mark. 

Ongoing research into emergent behavior, especially zero-shot learning, makes me optimistic that GenAI could play the role of a co-creating sidekick in the near future. Because the jury is still out on whether GenAI models could – or should – ever achieve human levels of creativity, I note this as a “nice to have,” as opposed to a necessity like the other three points.

All I want for GenAI in 2024 is … 

My wishlist of four ways in which I would love to see GenAI mature in 2024 is primarily derived from the point of view of a practitioner working on improving products and services at the frontiers of finance, tech and data in emerging markets. I don’t necessarily have a view on whether GenAI is inherently good or bad. I also don’t have a strong view on whether GenAI models are essentially stochastic parrots, or actually understand the world we live in.

I do, however, strongly believe that those who can leverage GenAI will outperform those who do not. I look forward to GenAI empowering initiatives of all kinds, from sustainable livelihoods to building climate resilience to fostering the next generation of inclusive ventures. I want it to do so without triggering the kinds of consumer protection concerns digital credit did when it exploded without guardrails a few years ago, or getting stuck in the purgatory of perennially inadequate product-market fit (euphemistically referred to as being “too early”) despite great interest, like crypto. 

I have no doubt that, done right, GenAI will supercharge interactive agents at the last mile the way broadband did the internet, or smartphones did mobile telephony. I humbly submit that this wishlist of four items will contribute to bridging that gap. 

Till then, channeling the immortal words of Taylor Swift:

’Cause corpora gonna say, say, say, say, say,

And models gonna dream, dream, dream, dream, dream,

Baby, GenAI just gotta train, train, train, train, train,

We’ll make it work, we’ll make it work!

Ashirul Amin, PhD is the Managing Principal Consultant at BFA Global. He set up BFA’s data science practice, has worked on ML/AI-related projects over the years, and has an academic background in computer science, with a focus on AI.
