What is going on with AlphaFold3?

2024-05-18

A revolution in the field of Protein Science?


On 8 May 2024, Google DeepMind and Isomorphic Labs introduced the world to their new tool for protein structure prediction, AlphaFold3, a more powerful version of the already existing AlphaFold2, with which Google DeepMind had already reconstructed more than 200 million protein structures (almost every known protein) and cracked the a priori protein structure prediction challenge that Bioinformaticians had been chasing for decades (I talked about it in more detail here).

Are we on the verge of another revolution? Is AlphaFold3 really a game changer as its predecessor was? In this blog post, we’ll explore the potential breakthroughs and new applications, as well as some limitations that the authors themselves recognized.

What’s new?

If you read the abstract of the paper, accepted by Nature and published open-access on their website, you will notice some interesting news:

The introduction of AlphaFold 2 has spurred a revolution in modelling the structure of proteins and their interactions, enabling a huge range of applications in protein modelling and design. In this paper, we describe our AlphaFold 3 model with a substantially updated diffusion-based architecture, which is capable of joint structure prediction of complexes including proteins, nucleic acids, small molecules, ions, and modified residues. The new AlphaFold model demonstrates significantly improved accuracy over many previous specialised tools: far greater accuracy on protein-ligand interactions than state of the art docking tools, much higher accuracy on protein-nucleic acid interactions than nucleic-acid-specific predictors, and significantly higher antibody-antigen prediction accuracy than AlphaFold-Multimer v2.3. Together these results show that high accuracy modelling across biomolecular space is possible within a single unified deep learning framework.

Let’s break this down, so that Biologists can understand the AI concepts and AI Scientists can understand the Biology ones:

0. Let’s introduce some terminology

0a. For the Biologists

  • Machine Learning: Machine Learning is the process by which computers learn to abstract from data, based not on human-made instructions but on advanced statistical and mathematical models.
  • Deep Learning: Deep Learning is a Machine Learning framework that is prominently built on Neural Networks and uses a brain-like architecture to learn.
  • Neural Network: A Neural Network is somewhat like a network of neurons in the brain, even though much simpler: there are several checkpoints (the neurons), connected with one another, that receive and pass on information if an activation threshold is reached, exactly as happens with the action potential of a real neural cell.

0b. For the AI Scientists

  • Protein: Proteins are biomolecules of varying size, made up of small building blocks known as amino acids. They are the factotum of a cell: if you imagine a cell as a city, proteins represent its transportation system, communication web, police, factory workers… A protein has a primary structure (the flat amino-acid chain), a secondary structure (local 3D motifs) and a tertiary structure (the ordered, overall 3D fold).
  • Ligand: A ligand is something that binds something else: in the context of proteins, it can be a neuro-hormonal signal (like adrenaline) that binds its receptor.
  • Nucleic Acids: Nucleic acids (DNA and RNA) are the biomolecules that contain the information of the living system: they are written in a universal language, defined by their building blocks (the nucleotides), and they can be translated into proteins. Going back to the city example we made before, they could be seen as its Administration Service. Nucleic acids often interact with proteins.

1. The diffusion architecture

By “diffusion” we mean the kind of generative AI that can, for example, create images from a text prompt. The idea behind diffusion is perfectly suited to the problem of protein structure prediction, since this too is a sequence-conditioned generation task: even though the 3D structure of a protein may seem completely unrelated to its 1D amino-acid chain, there is actually a stronger link than anyone might think. At the end of the day, all of the 3D interactions among amino acids are already defined by their order in the primary chain.

The diffusion architecture in AlphaFold3 works on raw atom coordinates: after the first prediction steps, produced by a set of neural network blocks (similar but not identical to those of AlphaFold2), the model turns a “fuzzy” structure, full of positional and stereochemical noise, into a well-defined, sharp one. The big advantage of the diffusion model is that it can predict the local structure even when the upstream network is unsure about the correct amino-acid coordinates: this is achieved thanks to the generative process, which produces a distribution of answers that captures most of the possible variability in the protein structure.
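To give an intuition, here is a toy sketch of the iterative denoising idea. It is purely illustrative and not AlphaFold3’s actual code: the “denoiser” here cheats by knowing the target, whereas a real diffusion module learns to predict the denoised coordinates from data.

```python
import numpy as np

rng = np.random.default_rng(0)

target = rng.normal(size=(50, 3))         # stand-in for the true atom coordinates
coords = 10.0 * rng.normal(size=(50, 3))  # start from a "fuzzy" random cloud

# Walk from high noise to low noise, refining the structure at each step.
for noise_level in np.linspace(1.0, 0.01, 20):
    predicted_shift = target - coords     # a trained network would predict this
    coords = coords + 0.3 * predicted_shift + noise_level * rng.normal(size=coords.shape)

print(np.abs(coords - target).mean())     # small: the cloud has sharpened into shape
```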

Like every generative model, AlphaFold3’s diffusion module is prone to hallucination: this is particularly true for unstructured regions of a protein (which lack a defined and stable tertiary structure). The AlphaFold3 diffusion blocks are trained in such a way that, in those regions, they produce randomly coiled chains of amino acids, mimicking the behaviour of AlphaFold-Multimer v2.3 (whose predicted structures were used for this part of the training).

2. New tasks and better accuracy

As reported in the abstract, AlphaFold3 now outperforms task-specific software for:

  • Protein-Nucleic Acid interaction
  • Protein-Ligand interaction
  • Antigen-Antibody interaction

Why are these three tasks so important to us?

  • Proteins commonly interact with DNA and RNA: as reported by Cozzolino et al. (2021), these interactions “affect fundamental processes such as replication, transcription, and repair in the case of DNA, as well as transport, translation, splicing, and silencing in the case of RNA”. All of these are key cellular functions that, if disrupted, can cause serious diseases. Moreover, understanding how proteins bind DNA and RNA can be really useful in genome editing (CRISPR-Cas9 is actually an RNA-protein-DNA system) and in the fight against bacteria and antimicrobial resistance (much antimicrobial resistance depends on protein-DNA interactions that activate a specific gene which makes the bacterium resistant to the antibiotic).
  • Protein-Ligand interaction is key in drug design: up to now, we have used “docking”, meaning that we simulated the interactions between certain molecule types and proteins by re-iterating those interactions with slightly different chemical structures and positions. Needless to say, this is time-consuming and computationally intense, and AlphaFold3 can definitely improve on both aspects while also retaining higher accuracy.
  • Antigen-Antibody interaction is the process by which proteins produced by our immune system (antibodies) bind foreign or mutated, potentially harmful, molecules: it is one of the methods by which pathogens are found and eliminated. Predicting these interactions is key to understanding the immune system’s responses to pathogens, but also to anything we want to introduce into the body in order to cure it. It also plays an incredibly important role in tumour cell recognition, as tumour cells may carry slight modifications of their cell-specific antigens that are not recognized as a threat by our immune system, but can be identified (and thus potentially targeted) thanks to computational means.

What are the limitations?

As the authors of the paper reported, they are aware of several big limitations:

  1. Difficulties in predicting chirality: chirality is an intrinsic property of a molecule that deals with how the molecule rotates polarized light. Two molecules that differ in nothing but chirality are like your hands: they look perfectly alike, but you can’t superimpose them palm to palm. Even though a chirality penalty has been introduced, the model still produces about 4% chirality-violating structures.
  2. Clashing atoms: there is a tendency, especially with nucleic acids of more than 100 nucleotides interacting with proteins of more than 2,000 amino acids, to place atoms in the same region of space (which is physically impossible).
  3. Hallucinations, as discussed before, can still happen, so an intrinsic ranking system has been introduced to help the model discard hallucinated structures.
  4. There are still some tasks, such as Antigen-Antibody prediction, where AlphaFold3 can improve. The authors observed improvements when the diffusion model is given more seeds (up to 1000), i.e. the random numbers that initialize the generative process, whereas sampling more diffusion outputs per seed brought no substantial advancement.
  5. As with all protein-prediction models, proteins are predicted in their “static” form, and not “in action”, as they dynamically operate inside a living cell.

Conclusion and open questions

AlphaFold3 definitely represents a breakthrough in Protein Sciences: still, we are not at an arrival point.

This model marks the kick-off of a new generative AI approach to complex biological problems, which we also saw with OpenCRISPR: on one hand, this holds incredible potential; on the other, the risk is that we decrease the explainability of our models, leaving scientists with auto-generated accuracy metrics that are not necessarily able to tell them why a protein has a certain structure.

Another really important point is that AlphaFold3 is not completely open-source: there is an online server provided by Google, but the code, as stated in the paper, is not released (except for some mock code that simulates the architecture). This poses a big ethical question: are we sure that we want a world where access to advanced scientific tools is protected by strict licenses and not everyone can see what is actually going on in software by accessing its code?

And, more importantly now than ever, we must ask ourselves: are we really going to rely on AI that is not fully open-source to design our drugs, deliver targeted genome editing and cure diseases?

References

  • Abramson, J., Adler, J., Dunger, J. et al. Accurate structure prediction of biomolecular interactions with AlphaFold 3. Nature (2024). https://doi.org/10.1038/s41586-024-07487-w
  • Cozzolino F, Iacobucci I, Monaco V, Monti M. Protein-DNA/RNA Interactions: An Overview of Investigation Methods in the -Omics Era. J Proteome Res. 2021;20(6):3018-3030. doi:10.1021/acs.jproteome.1c00074

Vector databases explained

2024-05-07

Understanding the problem

Let’s say you want to set up a book store with ten thousand books, and you have the necessity to classify them in order to make them easily and readily available to your customers.

What would you do? You could decide to order them based on their title, the name of their author or their genre, but all of these approaches come with limitations and can impoverish the customers’ experience.

The best way to catalogue your books would be to give them a unique set of indices, based on their features (title, author’s first and last name, theme…): each of the books would then be stored on the shelves labelled with its own, original identifier.

Whoever wanted to search the store would be able to do so quickly, simply by accessing progressively smaller subsets of the book catalogue until they reach the book of interest: the search is based not only on the title or only on the author, but on a combination of keys that we can extract from the (meta)data associated with the books.

The idea behind vector databases is the same: they can be used to represent complex data, with lots of features, based on a set of multi-dimensional numeric objects (vectors). In this sense, we can collapse the information (contained in long texts, images, videos or other data) into numbers, without actually losing most of it but, at the same time, easing the access to it.

How is the database created?

The first technical challenge we encounter with vector databases is transforming non-numerical data into vectors, which are made up of numbers.

The extraction of features is generally achieved with an encoder: a component that can exploit several techniques, such as neural networks, traditional machine learning methods, hashing or other linear mapping procedures.

The encoder receives texts, images, videos or sounds, already variously preprocessed (e.g., subdivided into smaller batches), and is trained to recognize patterns, structures and groups, compressing them into the numeric representations that get piped into the vector.

The vectors can come along with metadata associated with the raw data: the whole object that is loaded into the vector database along with the vector is called a “payload”.
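As a hypothetical example (the field names here are illustrative, not any specific provider’s schema), a payload for one of our book store items could look like this:

```python
payload = {
    "id": 42,
    "vector": [0.12, -0.58, 0.33, 0.91],  # the encoder's numeric representation
    "metadata": {                         # travels together with the vector
        "title": "A Wizard of Earthsea",
        "author": "Ursula K. Le Guin",
        "genre": "fantasy",
    },
}
```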


It is not unusual for vector database providers to employ quantization techniques, such as binary quantization, to speed up and lighten the storage process: quantization is a further compression of the information associated with the data, rescaling it according to some rule. For example, the previously mentioned binary quantization works as follows: everything below a certain threshold (let’s say 0, for the sake of simplicity) is mapped to 0, everything above is mapped to 1; in this sense, a vector like [0.34, -0.2, -0.98, 0.87, 0.03, -0.01] becomes [1, 0, 0, 1, 1, 0].
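In code, the thresholding step described above is essentially a one-liner (a minimal sketch with NumPy):

```python
import numpy as np

def binary_quantize(vector, threshold=0.0):
    """Map components below the threshold to 0 and the rest to 1."""
    return (np.asarray(vector) > threshold).astype(int)

print(binary_quantize([0.34, -0.2, -0.98, 0.87, 0.03, -0.01]))
# [1 0 0 1 1 0]
```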

In general, after having been encoded, loaded and (optionally) quantized, vectors are indexed, which means their similarity to existing vectors is computed and they are arranged into “semantic groups”: going back to our book store example, this is the same as putting on the same shelf all the fantasy books whose title starts with “A” and whose author’s first name is “Paul”.

The similarity can be computed with several techniques, such as the following (a short code sketch after the list illustrates each):

  • L2 (Euclidean) distance: the straight-line distance between two points on a plane
  • L1 (Manhattan) distance: the sum of the projections onto the axes of the line segments between two points on a plane
  • Cosine similarity: the cosine of the angle between two vectors on a plane
  • Dot product: the product of the magnitudes of the two vectors, multiplied by the cosine of the angle between them
  • Hamming distance: the number of positions at which one vector would have to change to become identical to the other.
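Here is a small NumPy sketch of each metric, on toy vectors, just for illustration:

```python
import numpy as np

a = np.array([1.0, 2.0, 3.0])
b = np.array([2.0, 0.0, 4.0])

l2 = np.linalg.norm(a - b)                             # Euclidean (L2) distance
l1 = np.abs(a - b).sum()                               # Manhattan (L1) distance
cos = a @ b / (np.linalg.norm(a) * np.linalg.norm(b))  # cosine of the angle
dot = a @ b                                            # |a| |b| cos(angle)

# Hamming distance is defined on discrete vectors, e.g. after binary quantization:
ham = int((np.array([1, 0, 0, 1]) != np.array([1, 1, 0, 0])).sum())  # 2
```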

Now we have our data nicely vectorized and arranged into smaller, semantically similar subsets: time to use this database!

How is the database searched?

The database is generally searched with similarity-based techniques, which compute the degree of similarity between two (or more) vectors and retrieve the N most similar ones (with N specified to the search algorithm).

The idea behind it is very simple.


The query from the user gets transformed into a vector by an encoder and is then compared to the already-indexed database (imagine the indexed vectors as points on the xy plane, grouped by color and position): instead of comparing the query with every vector to find the most similar one, the query vector is readily paired with its most similar semantic group, and then the N best-fitting data points are sorted and (optionally) filtered according to a pre-defined metadata filter. The result of the search is then returned to the user.
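A minimal sketch of this two-stage idea (pick the closest semantic group first, then rank only inside it), assuming cosine similarity and random toy data:

```python
import numpy as np

def cosine_sim(query, matrix):
    query = query / np.linalg.norm(query)
    matrix = matrix / np.linalg.norm(matrix, axis=1, keepdims=True)
    return matrix @ query

rng = np.random.default_rng(1)
groups = {g: rng.normal(size=(100, 8)) for g in range(5)}  # 5 "semantic groups"
centroids = np.stack([vecs.mean(axis=0) for vecs in groups.values()])

def search(query, n=3):
    group = int(np.argmax(cosine_sim(query, centroids)))   # closest group first...
    sims = cosine_sim(query, groups[group])                # ...then compare inside it
    top = np.argsort(sims)[::-1][:n]                       # indices of the N best hits
    return group, top, sims[top]

print(search(rng.normal(size=8)))
```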

Similarity search is tremendously different from traditional keyword-based search: as a matter of fact, similarity search relies on a semantic architecture, in which two words like “mean” (the noun) and “average” are highly similar, while “mean” (the verb) and “mean” (the noun) are not. If you search for “the definition of ‘mean’ and ‘median’” using a traditional keyword-based search, chances are that you get something like: “The meaning is the definition of a word”, which is completely irrelevant to your original query. On the other hand, a semantic search “understands” the context, and may retrieve something like: “Mean (statistics): The average value of a set of numbers, calculated by summing all the values and dividing by the number of values”.

What are the most important use cases?

The most important use cases can be:

  • RAG (Retrieval Augmented Generation): this is a technique employed to get more reliable responses from LLMs. It is not unusual for language models to hallucinate, providing false or misleading information: you can build a vector database with all the relevant information you want your model to know, query that database right before feeding your request to the LLM, and provide the results of the vector search as context (see the sketch after this list)… This will remarkably improve the quality of the AI-generated answers!
  • Image search and match, which can be useful to identify people, places, molecular structures, tumoral masses…
  • Efficient storage of and search among video and audio files: you could simply provide a fragment of a song and get the highest-scoring matches back from the search results.
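Here is a hedged sketch of the RAG flow from the first bullet above; `vector_search` and `ask_llm` are hypothetical stand-ins for a real vector database client and a real LLM API:

```python
def vector_search(query: str, n: int = 2) -> list[str]:
    # Stand-in: a real implementation would embed the query and run a
    # similarity search against the vector database.
    knowledge = [
        "A payload is the vector plus the metadata stored alongside it.",
        "Vectors are indexed into semantic groups to speed up search.",
    ]
    return knowledge[:n]

def ask_llm(prompt: str) -> str:
    return f"(model answer grounded in: {prompt!r})"  # stand-in for an LLM call

def rag_answer(question: str) -> str:
    context = "\n".join(vector_search(question))
    prompt = f"Answer using only this context:\n{context}\n\nQuestion: {question}"
    return ask_llm(prompt)

print(rag_answer("What is a payload in a vector database?"))
```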

Conclusion

In a world where data throughput is skyrocketing, organizing data in an ordered, easy-to-access and retrieval-friendly way will become a critical task in the near future.

Vector databases will probably prove to be the best tool for tackling the huge load of data we will have to manage, so learning how they work and which services you can rely on to store and access your data is fundamental.

Here is a list of interesting vector database services (with the descriptions they provide of themselves):

  • Qdrant: “Powering the next generation of AI applications with advanced and high-performant vector similarity search technology”
  • ChromaDB: “Chroma is the open-source embedding database. Chroma makes it easy to build LLM apps by making knowledge, facts, and skills pluggable for LLMs.”
  • Pinecone: “The vector database to build knowledgeable AI”
  • Weaviate: “Weaviate is an open source, AI-native vector database that helps developers create intuitive and reliable AI-powered applications.”

There are, of course, many other providers, so do not stop exploring the world out there for new and better solutions!


The most worrisome health challenges according to LLMs

2024-05-07

What will the most worrisome global health challenge be in the next 30 years?

I asked ten Large Language Models the following question: “What will the most worrisome global health challenge be in the next 30 years?”. In this article, you can find their answers: if you do not wish to read the whole article, here is a summary of the most common challenges identified by all these models:

  • Antimicrobial resistance
  • Climate change
  • Non-communicable diseases
  • Pandemics
  • Inequalities (both in health and in society)
  • Mental health
  • Ethical issues raised by emerging technologies

I strongly advise you to read at least the first and the last responses, as they are the most insightful ones.


GPT4 Turbo

Predicting the most significant global health challenges over the next 30 years involves considering current trends, emerging threats, and the changing global landscape. Here are several potential contenders:

  1. Antimicrobial Resistance (AMR): As bacteria, viruses, and fungi evolve to resist existing treatments, AMR could become a formidable challenge, leading to a situation where common infections and minor injuries could once again become deadly. The World Health Organization (WHO) has already flagged AMR as one of the top 10 global public health threats facing humanity.

  2. Climate Change: Health impacts related to climate change are expected to intensify, including heat-related illnesses, respiratory issues from poor air quality, and the spread of vector-borne diseases like malaria and dengue fever. Climate change also impacts food security and water availability, which in turn can lead to malnutrition and poor health.

  3. Pandemics: The COVID-19 pandemic highlighted the global vulnerability to emerging infectious diseases. Future pandemics could arise from other novel viruses or known threats that evolve in new ways. The increasing interaction between humans and wildlife, along with global travel and urbanization, heightens this risk.

  4. Non-communicable Diseases (NCDs): Diseases like diabetes, heart disease, and cancer are projected to increase, particularly in low- and middle-income countries. These diseases strain healthcare systems and are influenced by aging populations, lifestyle changes, and possibly by environmental factors.

  5. Mental Health: Mental health issues are becoming increasingly recognized as critical components of global health. Issues like depression and anxiety impose significant disease burdens and are affected by numerous factors from societal changes to economic instability.

  6. Health Inequities: Disparities in health outcomes between different groups within and across countries are likely to persist or worsen unless addressed. These disparities often arise from inequalities in social determinants of health such as income, education, employment, and access to healthcare.

  7. Technological and Biosecurity Risks: Advances in technology such as synthetic biology and artificial intelligence offer tremendous potential benefits but also come with risks. Misuse, accidents, or unintended consequences could lead to new health crises.

Addressing these challenges will require coordinated global efforts, including investment in research, healthcare infrastructure, disease surveillance, and public health initiatives. Effective leadership and governance will also be critical in navigating the complex interplay of factors that influence global health.

REKA-CORE

Predicting the most worrisome global health challenge over the next 30 years involves considering several factors, including emerging diseases, climate change, antibiotic resistance, and changes in population demographics. Here are a few potential candidates:

  1. Antimicrobial Resistance (AMR): The overuse and misuse of antibiotics, antivirals, and antifungals have led to the emergence of resistant strains of bacteria, viruses, and fungi. This could render many current treatments ineffective, leading to longer hospital stays, higher healthcare costs, and increased mortality rates from infectious diseases.

  2. Climate Change and Health: Climate change is expected to have a profound impact on global health, exacerbating existing health challenges and creating new ones. This includes the spread of vector-borne diseases (like malaria and dengue fever) into new areas, increased frequency of heatwaves leading to heat stress and dehydration, and more severe natural disasters affecting public health infrastructure.

  3. Non-Communicable Diseases (NCDs): As populations age and lifestyles change, NCDs such as cardiovascular diseases, cancer, diabetes, and chronic respiratory diseases are likely to become an even greater burden. These diseases are often linked to modifiable risk factors like diet, physical activity, tobacco use, and alcohol consumption, making prevention a complex challenge.

  4. Emerging Infectious Diseases: The emergence of new infectious diseases (like COVID-19) and the re-emergence of previously controlled diseases pose ongoing threats. Global connectivity and population density can facilitate the rapid spread of diseases, making early detection, containment, and response critical.

  5. Mental Health: The global burden of mental health disorders is expected to increase, influenced by factors such as social isolation, economic stress, and the impacts of climate change. Addressing mental health will require integrated approaches that go beyond traditional healthcare systems.

  6. Access to Healthcare and Health Inequities: Global disparities in access to healthcare services, quality of care, and health outcomes are likely to remain a significant challenge. Ensuring equitable access to healthcare, especially in low-resource settings, will be crucial for addressing global health needs.

Each of these challenges presents unique difficulties and will require coordinated global efforts, innovative solutions, and significant investment to mitigate their impact on public health.

yi-34b-chat

Predicting the most worrisome global health challenge for the next 30 years is challenging due to the complex and dynamic nature of health threats. However, based on current trends and the potential impact of various factors, several issues stand out as likely to have a significant impact on global health:

  1. Climate Change: Climate change is expected to exacerbate many health challenges, including the spread of vector-borne diseases, food and water insecurity, and heat-related illnesses.

  2. Antimicrobial Resistance (AMR): The increasing resistance of bacteria, viruses, and parasites to existing drugs is a growing concern, potentially leading to a post-antibiotic era where common infections become untreatable.

  3. Non-Communicable Diseases (NCDs): With the rise of chronic diseases like heart disease, cancer, diabetes, and respiratory diseases, the burden of NCDs is expected to increase, particularly in low- and middle-income countries.

  4. Infectious Diseases: Emerging and re-emerging infectious diseases, such as COVID-19, Ebola, Zika, and multi-drug resistant tuberculosis, can rapidly become global health emergencies.

  5. Mental Health: The prevalence of mental health conditions is increasing, and the impact of social, economic, and environmental changes is likely to exacerbate this trend.

  6. Health Disparities: The gap in health outcomes between different populations, often linked to social determinants of health, is a persistent challenge that is likely to persist or worsen due to economic and social inequalities.

  7. Healthcare Access and Affordability: As the population ages and the demand for healthcare increases, ensuring access to affordable, quality healthcare services will become increasingly critical.

  8. Global Health Security: The ability of countries to prevent, detect, and respond to health emergencies will be crucial in an interconnected world.

  9. Emerging Technologies: The integration of new technologies into healthcare, while potentially transformative, also raises ethical, legal, and social issues that need to be addressed.

  10. Urbanization and Environmental Health: The health impacts of rapid urbanization, including air pollution and inadequate urban planning, will disproportionately affect vulnerable populations.

To prepare for and address these challenges, it will be essential for governments, international organizations, researchers, healthcare providers, and communities to work together to develop and implement sustainable health policies, strategies, and interventions. Adaptation, resilience, and innovation will be key in facing the evolving health threats of the future.

Gemini 1.5 PRO

Predicting the Most Worrisome Global Health Challenge in the Next 30 Years

Predicting the future of global health is a complex task, as it involves considering various interacting factors like evolving diseases, technological advancements, climate change, and socioeconomic trends. However, based on current knowledge and trends, here are some potential candidates for the most worrisome global health challenges in the next 30 years:

1. Antimicrobial Resistance (AMR):

  • The Problem: The overuse and misuse of antibiotics have led to the emergence of drug-resistant bacteria, making infections increasingly difficult to treat. This could potentially lead to a post-antibiotic era where common infections become life-threatening.
  • Why it’s Worrisome: AMR poses a significant threat to global health security, potentially undermining modern medicine and leading to increased mortality and morbidity from previously treatable infections.
  • What can be done: Strategies to combat AMR include promoting responsible antibiotic use, investing in research and development of new antibiotics and alternative therapies, and strengthening surveillance systems.

2. Emerging and Re-emerging Infectious Diseases:

  • The Problem: Factors like climate change, globalization, and ecological disruption increase the risk of new infectious diseases emerging and old ones re-emerging. Examples include zoonotic diseases (diseases transmitted from animals to humans) like COVID-19 and vector-borne diseases like Zika and Dengue.
  • Why it’s Worrisome: These diseases can cause large-scale outbreaks and pandemics with significant health and economic consequences.
  • What can be done: Strengthening global surveillance systems, investing in early detection and rapid response mechanisms, and promoting research on vaccines and treatments are crucial.

3. Non-Communicable Diseases (NCDs):

  • The Problem: NCDs like heart disease, cancer, diabetes, and chronic respiratory diseases are already the leading causes of death globally. Their prevalence is expected to rise further due to aging populations, unhealthy lifestyles, and environmental factors.
  • Why it’s Worrisome: NCDs place a significant burden on healthcare systems and economies, impacting individuals, families, and societies.
  • What can be done: Promoting healthy lifestyles, implementing preventive measures, and ensuring access to affordable and quality healthcare services are essential to address NCDs.

4. Climate Change and Health:

  • The Problem: Climate change impacts health directly through extreme weather events and indirectly by influencing factors like air pollution, water quality, and food security.
  • Why it’s Worrisome: The health effects of climate change are expected to worsen, disproportionately affecting vulnerable populations and exacerbating existing health inequities.
  • What can be done: Mitigation and adaptation strategies are crucial. This includes reducing greenhouse gas emissions, investing in climate-resilient healthcare systems, and supporting communities to adapt to climate change impacts.

5. Health Inequities:

  • The Problem: Unequal access to healthcare services and disparities in health outcomes persist within and between countries.
  • Why it’s Worrisome: Health inequities lead to unnecessary suffering and preventable deaths, hindering social and economic development.
  • What can be done: Addressing social determinants of health, strengthening health systems, and promoting universal health coverage are essential to reducing health inequities.

In conclusion, while it’s difficult to predict with certainty, the next 30 years will likely see a complex interplay of these challenges. Addressing them will require a multi-faceted approach involving collaboration across sectors, sustained investment in research and innovation, and a commitment to global health equity.

CommandR Plus

It is difficult to predict with certainty what the most worrisome global health challenge will be in the next 30 years, as the field of health and healthcare is constantly evolving and new challenges can emerge unexpectedly. However, based on current trends and ongoing issues, some potential global health challenges that may cause concern in the next 30 years include:

  • Infectious diseases: The emergence and spread of new infectious diseases, such as COVID-19, could pose a significant threat to global health. As the world becomes more interconnected and global travel becomes easier, the risk of disease outbreaks spreading rapidly across borders increases. Additionally, the rise of antibiotic resistance and the evolving nature of pathogens could make treating and controlling infectious diseases more challenging.

  • Non-communicable diseases: Non-communicable diseases, such as heart disease, stroke, cancer, and diabetes, are already the leading causes of death and disability worldwide. In the next 30 years, the global burden of these diseases is expected to increase as populations age and risk factors such as obesity and physical inactivity become more prevalent. Addressing non-communicable diseases will require a multifaceted approach, including prevention, early detection, and improved access to healthcare.

  • Mental health: Mental health issues, such as depression, anxiety, and substance use disorders, are already a significant burden on global health, and their impact is expected to grow in the coming decades. Stigma and lack of access to mental health services are ongoing challenges that will need to be addressed to improve mental health outcomes.

  • Climate change: Climate change is expected to have significant impacts on global health in the coming decades. Extreme weather events, such as heat waves, floods, and droughts, can directly cause injuries and deaths, as well as contribute to the spread of infectious diseases and food and water insecurity. Additionally, climate change may lead to population displacement and migration, which can further strain healthcare systems and resources.

  • Healthcare access and inequality: Inequalities in access to healthcare are likely to persist and potentially worsen in the next 30 years, as the cost of healthcare continues to rise and healthcare resources become increasingly strained. This could lead to disparities in health outcomes between different populations and regions.

  • Emerging technologies: The rapid advancement of technology, such as artificial intelligence and gene editing, could also present new challenges and ethical dilemmas in the field of healthcare. While these technologies have the potential to revolutionize healthcare, they also raise complex questions about privacy, equity, and the potential for misuse.

It is important to note that these are potential challenges and the future of global health will depend on a variety of factors, including advancements in medical research, public health initiatives, and global collaboration. By identifying and addressing these challenges now, we can work towards building a healthier and more resilient future for all.

GPT-3.5

Predicting the most worrisome global health challenge over the next 30 years is complex and uncertain, as it can be influenced by various factors such as emerging diseases, environmental changes, antimicrobial resistance, and social disparities. However, some experts suggest that antimicrobial resistance could become a significant threat to global health in the coming decades. This is due to the overuse and misuse of antibiotics leading to the development of drug-resistant strains of bacteria, making infections more difficult to treat. Addressing this issue will require coordinated efforts from governments, healthcare providers, and the public to develop new treatment strategies and promote responsible antibiotic use.

im-also-a-good-gpt2-chatbot

Predicting the most worrisome global health challenge over the next 30 years is complex due to the dynamic nature of global health issues. However, several key challenges stand out based on current trends and projections:

Climate Change and Health Impacts

  • Extreme Weather Events: Increased frequency and severity of heatwaves, hurricanes, floods, and droughts leading to direct health impacts.
  • Vector-borne Diseases: Expansion of habitats for mosquitoes and ticks could increase the spread of diseases like malaria, dengue, and Lyme disease.
  • Food and Water Security: Changes in agricultural productivity and water availability could exacerbate malnutrition and waterborne diseases.

Antimicrobial Resistance (AMR)

  • Superbugs: The rise of drug-resistant bacteria, viruses, and fungi could render many existing treatments ineffective, making infections harder to treat and leading to higher morbidity and mortality rates.
  • Global Spread: International travel and trade could facilitate the rapid spread of resistant pathogens.

Pandemics and Emerging Infectious Diseases

  • Novel Pathogens: The risk of new zoonotic diseases spilling over into human populations remains high due to habitat encroachment, wildlife trade, and intensive agriculture.
  • Preparedness and Response: Despite lessons from COVID-19, challenges in global coordination, equitable vaccine distribution, and health infrastructure remain.

Non-Communicable Diseases (NCDs)

  • Chronic Diseases: Rising rates of heart disease, diabetes, cancer, and respiratory diseases due to aging populations, lifestyle factors, and urbanization.
  • Mental Health: Growing burden of mental health disorders, particularly anxiety and depression, exacerbated by socioeconomic stressors and social isolation.

Health Inequities

  • Access to Healthcare: Disparities in healthcare access between and within countries could worsen health outcomes.
  • Socioeconomic Determinants: Poverty, education, and inequality will continue to impact health outcomes disproportionately.

Technological and Ethical Challenges

  • Healthcare Technology: Balancing the benefits of digital health, AI, and genomic medicine with ethical concerns and equitable access.
  • Data Privacy: Protecting patient data while harnessing big data for health insights.

Conclusion

Among these, climate change and its health impacts seem to be the most worrying due to their pervasive and compounding effects on other health challenges. For instance, climate change can exacerbate infectious disease outbreaks, impair food and water security, and increase the prevalence of chronic diseases through environmental stressors.

What Can Be Done?

  • Global Collaboration: Strengthening international cooperation and institutions for coordinated responses.
  • Investment in Health Systems: Building resilient health infrastructure.
  • Research and Innovation: Fostering technological advancements and equitable access to treatments.
  • Public Health Policies: Implementing proactive policies that address climate change, AMR, and emerging diseases.

Ultimately, the global health landscape will require adaptive, interdisciplinary, and collaborative approaches to navigate the challenges of the next 30 years.

Gemma-7B v1.1

1. Climate Change and its Health Impacts:

  • Rising temperatures, extreme weather events, and air pollution will lead to increased heat-related illnesses, respiratory problems, mental health disorders, and displacement of populations.

2. Infectious Disease Emergence and Evolution:

  • Increased antimicrobial resistance, globalization, and wildlife-human contact will facilitate the emergence and spread of novel infectious diseases.
  • Rapidly evolving pathogens and changing disease patterns will pose ongoing threats to global health security.

3. Chronic Diseases:

  • Aging populations, unhealthy diets, and sedentary lifestyles will contribute to a surge in chronic diseases such as heart disease, cancer, and diabetes.
  • The burden of these diseases will strain healthcare systems and economies.

4. Mental Health and Substance Abuse:

  • Increasing rates of depression, anxiety, and substance use disorders will pose significant health and social challenges.
  • Access to mental health services and support will be a critical need.

5. Health Equity and Access:

  • Disparities in healthcare access and outcomes will widen, leading to increased health inequities.
  • Ensuring equitable access to quality healthcare will be a major global priority.

6. Antimicrobial Resistance:

  • Continued overuse and misuse of antibiotics will accelerate the rise of antimicrobial resistance, leading to treatment failures and increased mortality.

7. Urbanization and Population Growth:

  • Rapid urbanization and population growth in developing countries will strain healthcare systems and infrastructure.
  • Air pollution, noise pollution, and overcrowding will contribute to health problems.

8. Occupational Health and Safety:

  • Automation and technological advancements will create new jobs but also expose workers to new health risks.
  • Ensuring safe and healthy working environments will be essential.

9. Social and Economic Factors:

  • Economic inequality, social unrest, and political instability can negatively impact health outcomes.

10. Emerging Technologies:

  • Rapid advancements in biotechnology and digital health will bring new opportunities but also potential risks to health privacy and access.

Qwen-32B-chat

As an AI language model, I cannot predict the future with certainty, but based on current trends and expert opinions, there are several global health challenges that could be particularly worrisome in the next 30 years:

  1. Antimicrobial Resistance (AMR): The overuse and misuse of antibiotics have led to the emergence of drug-resistant bacteria, viruses, and other pathogens, making it harder to treat infections. AMR is already a significant problem, and if not addressed, it could lead to a global health crisis.

  2. Non-Communicable Diseases (NCDs): Chronic conditions such as heart disease, cancer, diabetes, and respiratory diseases are on the rise globally and are expected to continue to be a major health challenge. They are often linked to lifestyle factors such as poor diet, lack of exercise, and smoking.

  3. Pandemics and Emerging Infectious Diseases: The COVID-19 pandemic has demonstrated how quickly a new virus can spread globally, causing significant morbidity and mortality. Other infectious diseases, such as Ebola, Zika, and Nipah virus, have also emerged in recent years, highlighting the need for better preparedness and response.

  4. Climate Change: Climate change is likely to have significant impacts on global health, including the spread of infectious diseases, increased air pollution, and extreme weather events that can disrupt health systems and exacerbate existing health disparities.

  5. Mental Health: Mental health conditions are already a significant burden globally, and the stress and isolation caused by factors such as the COVID-19 pandemic, economic instability, and social unrest are likely to exacerbate this challenge.

It’s important to note that these challenges are interconnected and may have compounding effects on global health in the coming decades. Addressing them will require collaboration, innovation, and sustained investment in public health infrastructure and research.

Llama-3 70B instructed

What a timely and important question! Based on current trends, research, and expert opinions, I’ll outline some of the most pressing global health challenges that may emerge or continue to worsen over the next 30 years. While it’s difficult to pinpoint a single most worrisome challenge, I’ll highlight a few that could have significant impacts:

  1. Antimicrobial Resistance (AMR): The rise of antibiotic-resistant bacteria, viruses, and fungi may lead to a post-antibiotic era, where common infections become untreatable. This could result in millions of deaths, economic losses, and a significant burden on healthcare systems.
  2. Climate Change and Health: Climate change will likely exacerbate health issues like heat stress, vector-borne diseases (e.g., malaria, dengue), and water scarcity. It may also increase the spread of infectious diseases, as changing weather patterns alter ecosystems and facilitate the movement of disease-carrying insects.
  3. Pandemics and Infectious Diseases: The COVID-19 pandemic has shown us how quickly infectious diseases can spread globally. Future pandemics, such as those caused by influenza, SARS, or MERS, could have devastating consequences. The increasing frequency and severity of outbreaks are exacerbated by factors like global travel, urbanization, and antimicrobial resistance.
  4. Non-Communicable Diseases (NCDs): The rising prevalence of NCDs, such as diabetes, heart disease, and certain types of cancer, will continue to strain healthcare systems and economies. The increasing burden of NCDs in low- and middle-income countries may worsen health disparities and limit economic growth.
  5. Mental Health: Mental health issues, including depression, anxiety, and substance abuse, are already a significant concern. The World Health Organization (WHO) estimates that mental health disorders will become the leading cause of disability globally by 2030, with significant economic and social implications.
  6. Global Health Inequality: The growing gap in healthcare access, quality, and outcomes between high-income and low-income countries may lead to increased health disparities, social unrest, and economic instability.
  7. Healthcare Systems Overwhelm: The increasing demand for healthcare services, combined with aging populations, may overwhelm healthcare systems, leading to reduced quality of care, longer wait times, and increased costs.
  8. Dementia and Neurological Disorders: The projected rise in dementia cases, particularly in low- and middle-income countries, will pose significant challenges to healthcare systems, families, and economies.
  9. Vaccine Hesitancy and Misinformation: The growing trend of vaccine hesitancy, fueled by misinformation and misconceptions, may lead to outbreaks of preventable diseases, undermining decades of progress in global health.
  10. Healthcare Workforce Shortages: The projected shortage of healthcare professionals, including doctors, nurses, and other essential workers, may compromise healthcare quality and access, particularly in low-income countries.

While these challenges are interconnected and complex, addressing them will require collaborative efforts from governments, international organizations, healthcare professionals, and individuals. By prioritizing global health investments, research, and innovation, we can mitigate the impact of these challenges and create a healthier, more equitable world for all.

Conclusion

Now that we have read through all these answers, two questions still linger in our minds: can AI be used to advise us on future challenges and suggest solutions? Can we leverage the power of Artificial Intelligence models to build a better future for the coming generations?

These questions should concern all of us, as it is today’s commitment that defines tomorrow’s success: let’s be proactive and try to find solutions, then, so that we can give our contribution toward the improvement of humanity and Planet Earth!


AI predicts proteins and edits DNA

2024-04-26

AI solves biology problems

Protein folding, a tough challenge


One of the longest-standing problems in the Life Sciences has been protein folding.

For starters, folding is the process through which a protein is turned from a linear string of amino acids into a three-dimensional working unit, suitable for carrying out all sorts of tasks in living cells.

This is a key problem not only because proteins are the fundamental building blocks of life, but also because misfolding (i.e. errors that occur while building the 3D structure) is at the heart of many human and animal diseases and abnormalities. Moreover, proteins are a target for many drugs (antibiotics, for example) and they can also be employed in a wide variety of industrial fields (most laundry detergents contain enzymes to remove dirt and stains, for instance).

In this sense, obtaining the 3D structure of a protein through a fast, reliable and scalable in silico method, such as a predictive algorithm, is much easier than relying on slow and expensive laboratory procedures, such as crystallography (which is, nevertheless, still considered the gold standard for protein structure reconstruction).

For 50 years, 3D structure prediction was a real struggle for bioinformaticians, because no algorithm seemed to produce accurate results; then the Google DeepMind team came along, and the rules of the game changed.

Alphafold, a game changer

In 2020, AlphaFold 2 won the 14th Critical Assessment of Structure Prediction (CASP14), a protein structure prediction competition, with unprecedented results.

In this competition, participants are evaluated according to the GDT (Global Distance Test): the score ranges from 0 to 100, with 100 meaning that all amino acids in the predicted 3D structure are exactly where they are supposed to be, or within a threshold distance. Informally, a GDT of 90 is considered competitive with laboratory procedures: in 2020, AlphaFold 2 achieved a median score of 92.4 overall, and a median of around 87 even on the hardest targets. Considering that no one, not even AlphaFold (base version), had surpassed 60 in the previous editions of CASP, that was a shockingly impressive result.
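To make the metric concrete, here is a simplified sketch of a GDT_TS-style score (real GDT searches over optimal superpositions of the two structures; this toy version just compares coordinates directly):

```python
import numpy as np

def gdt_ts(pred, true, thresholds=(1.0, 2.0, 4.0, 8.0)):
    """Mean percentage of residues within 1/2/4/8 angstroms of their true position."""
    dist = np.linalg.norm(pred - true, axis=1)        # per-residue error
    return 100 * np.mean([(dist <= t).mean() for t in thresholds])

rng = np.random.default_rng(0)
true = 20 * rng.normal(size=(100, 3))                 # toy "experimental" structure
pred = true + rng.normal(scale=0.5, size=true.shape)  # a rather good prediction
print(round(gdt_ts(pred, true), 1))                   # close to 100
```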

Now AlphaFold 2 can be used as a research tool by everyone: you can either run it within Colab (here) or you can search the AlphaFold database, which encompasses more than 200,000,000 predicted structures: if you browse through it long enough, you may even come across a protein structure that no one has ever seen!

So, now comes the big question: should we trust AI and rely on that for a task as delicate and critical as protein structure prediction? Share your thoughts in the comments below!

Genome editing: AI can modify human DNA


There’s a biotech startup that claims to have edited human DNA with the aid of AI: how come we are able to edit human DNA? And how exactly can AI help in doing that? Let’s take one step at a time.

The CRISPR era

Background

Since the very first discovery of the structure of DNA by Watson and Crick in 1953, the El Dorado of molecular biologists has been to “crack the code”: to introduce modifications into the genetic sequence in order to change its features. The first attempts came in the 60s and 70s, when wheat plants were modified to be resistant to cold temperatures and parasites, even though these first GMOs were obtained through very imprecise and coarse modifications based on DNA absorption and/or horizontal DNA transfer mechanisms already observed in bacteria (namely plasmids). Other paths that scientists explored to achieve this goal were radiation, nanoparticle bombardment and viral vector transduction, but all these methods suffered from being either imprecise or highly expensive, so that none could really be scaled to wide public use (one of the first medical treatments based on viral vectors for gene modification against spinal muscular atrophy, Zolgensma, cost around 2 million dollars), both for safety and financial reasons.

A revolutionary technique on the rise

Studying bacteria, scientists discovered, in 1987, that there were some intriguing repeated sequences in their DNA, among which some unique and non-repetitive sequences were always found. An explanation for this feature came only twenty years later, when it was understood that prokaryotes use this system as an “immune defense” against viruses (there is a wide class of viral agents that “eat” bacteria, known as bacteriophages).

To put it plainly and simply, there is a family of proteins, the Cas family, able to recognize and chop up the DNA of a viral invader based on previous encounters the cell has had with the same agent: to do so, they use a sort of “tracker”, which is actually the portion of the viral DNA they have to break. This portion is stored in the bacterial genome and is accessible thanks to a special signal, constituted by one of the repeated sequences discovered in 1987. This system was then called CRISPR-Cas, with CRISPR standing for Clustered Regularly Interspaced Short Palindromic Repeats (luckily they found a crispy acronym for that!).

Here came the breakthrough in science: what if we implemented the CRISPR-Cas system to target our own genes and correct the errors found there? In the end, it only takes a protein (Cas9 for most purposes) and a guide sequence, and then the system can freely work (it is a little more complex than this, but bear with me for today’s article).

CRISPR-Cas9 was actually implemented and used for gene editing, and its impact was so massive on today’s science that in 2020, Jennifer Doudna and Emmanuelle Charpentier, two CRISPR pioneers, were awarded with the Nobel Prize in Chemistry “for the development of a method for genome editing”.

Needless to say, this also comes with several ethical implications, which even turned into a sci-fi-like scenario when, in 2018, He Jiankui, a Chinese biophysicist, used CRISPR-Cas9 to modify two human embryos to make them resistant to HIV.

AI comes into the play

Regardless of the ethical problems that may come along with its implementation, CRISPR-Cas9 has proved to be one of the most effective and trustworthy techniques to edit DNA, and lots of experiments and clinical trials nowadays rely on it. Nevertheless, one of the biggest problems is the unsuitability of the Cas protein for certain editing targets: proteins are molecules with a highly complex 3D structure that does not always fit everywhere.

In this sense, finding alternatives to Cas9 and to the whole editing scaffold, to generate combinations that perfectly fit a given situation, is a key problem for scientists now. Or, better, it was a key problem: Profluent, a biotech startup, has now achieved incredible results in generating the sequences of Cas-like proteins using generative AI models.

In their paper “Design of highly functional genome editors by modeling the universe of CRISPR-Cas sequences” they explain how they did it:

  1. First, they mined over 26 TB of assembled microbial genomes, extracting more than 1 million CRISPR operons. They collected all these data in the “CRISPR-Cas9 Atlas”
  2. They trained an AI model (OpenCRISPR-1) to generate Cas9-like protein sequences
  3. They tested the editing efficacy of the generated proteins on human cell lines
  4. They collected data from other proteins involved in the editing process and designed a “fully synthetic base editor system” that would best suit the outputs of OpenCRISPR-1, and tested it on the same human cell lines as before.

The first results look incredible, and it seems that we are less than one step away from unlocking a new frontier of gene editing, which could be used for therapeutics, fighting the climate crisis, granting access to food for everyone, optimizing agricultural production, eradicating disease-bearing insects… and many other applications.

The coolest part of this project is that it is completely open source: everyone can use OpenCRISPR as long as they sign Profluent’s license and terms of use, which ensure the ethical application of the software for research and commercial purposes.

References

  • Ruffolo, J.A., Nayfach, S., Gallagher, J., Bhatnagar, A., Beazer, J., Hussain, R., Russ, J., Yip, J., Hill, E., Pacesa, M., Meeske, A.J., Cameron, P., Madani, A. Design of highly functional genome editors by modeling the universe of CRISPR-Cas sequences. bioRxiv 2024.04.22.590591; doi: https://doi.org/10.1101/2024.04.22.590591
  • OpenCRISPR GitHub
  • A wonderful book about CRISPR: Editing Humanity: The CRISPR Revolution and the New Era of Genome Editing, by Kevin Davies (this is a spontaneous suggestion, not an advertisement)

Leave your thoughts!

What do you think about the impact of this kind of AI on health, society and environment? Is it dangerous to apply AI to genome editing?

Leave a comment below!


Transformers architecture for everyone

2024-04-25

Transformers revolution

Ever wondered what GPT stands for? In case you did, and you didn’t have the time or the will to find an answer, here it is: Generative Pretrained Transformer. Now: for Generative and Pretrained we can easily think of an explanation, whereas “transformer” is not something we deal with in our everyday life (at least not that we are aware of). Let’s dive a little deeper into the subject, then, by exploring in plain and simple terms what transformers are and what we can do with them.

A new technique

Long before transformers were actually implemented, neural networks such as Recurrent Neural Networks (RNNs) or Long Short-Term Memory (LSTM) networks were tested on tasks such as machine transduction (performing an automated transformation of an input to generate a new output). There were, nevertheless, two big problems:

  • Vanishing gradient: the learning of these neural networks was not as stable as expected, and large inputs (such as long sentences in Natural Language Processing pipelines) were not easily handled. Long-term memory was a big problem in this sense, even with LSTM networks.
  • Computationally intense training: the lack of parallelization in the training and the high number of required operations made scaling LSTMs or RNNs very difficult, so they couldn’t be implemented or shipped at scale.

Transformers were able to solve these two issues by employing several solutions that we will explore here, such as tokenization, vectorization, attention and normalization.

The transformer architecture

In this blog post, we will discuss the transformer architecture in a non-technical way, and we will refer to the one proposed by Vaswani et al. in their 2017 paper “Attention Is All You Need”.

Here is a visual representation, which we are going to break down into smaller pieces:

[Figure: the transformer architecture, from Vaswani et al., 2017]

To understand everything, we’ll follow the path of the Italian sentence “Luca sta mangiando la torta” through our transformer, to translate it into English (“Luca is eating the cake”).

Encoder

The encoder is the first portion of the transformer: it turns natural language into a set of dense vectors of numeric values that can be handled by the decoder.

From input to input embedding

To feed our transformer with a sentence, we can’t just pass it natural language: a computer doesn’t handle words like our brain does, it can only manage numbers. We therefore need to first tokenize and then vectorize our input, producing what is called an “embedding”.

The first step is then to subdivide our original sentence into tokens (let’s stick with simple per-word tokenization: each word will be one token): [“Luca”, “sta”, “mangiando”, “la”, “torta”].

From now on, we need to switch from letters to numbers, so let’s say we embed these words as follows: [0.333, -1.456, 0.76, 0.89, 0.778] (in a real model each token maps to a dense vector of hundreds of values; one number per word is a simplification for readability). We then turn them into the array-like object that our encoder takes as its real input.
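
As a minimal sketch (in Python with NumPy, with entirely made-up values), tokenization plus an embedding lookup can look like this; real models use learned tables with vocabularies of tens of thousands of tokens and vectors of hundreds of dimensions, not the toy dimension of 4 used here for display.

```python
import numpy as np

sentence = "Luca sta mangiando la torta"
tokens = sentence.split()  # per-word tokenization: ["Luca", "sta", ...]

# Toy embedding table: each token maps to a dense vector. Real models
# learn these tables during training; the values here are random.
rng = np.random.default_rng(0)
vocab = {word: idx for idx, word in enumerate(tokens)}
embedding_table = rng.normal(size=(len(vocab), 4))

embeddings = np.stack([embedding_table[vocab[t]] for t in tokens])
print(embeddings.shape)  # (5, 4): one 4-dimensional vector per token
```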

Positional encoding and attention

As we can see, there is a precise order in the sentence: saying “Luca is eating the cake” is completely different from saying “The cake is eating Luca”. It is therefore key to encode not only the word itself, but also its position (this is particularly important in generative models for next-word/sentence prediction tasks). Generally, we do this by adding position-specific values to each word embedding.
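
As a sketch of one common scheme, here is the sinusoidal positional encoding from the original paper: dimension 2i of position pos gets sin(pos / 10000^(2i/d_model)), dimension 2i+1 gets the corresponding cosine, and the resulting matrix is simply added to the word embeddings.

```python
import numpy as np

def positional_encoding(num_positions, d_model):
    """Sinusoidal positional encodings, as in Vaswani et al. (2017)."""
    positions = np.arange(num_positions)[:, None]     # (num_positions, 1)
    dims = np.arange(0, d_model, 2)[None, :]          # (1, d_model / 2)
    angles = positions / np.power(10000.0, dims / d_model)
    pe = np.zeros((num_positions, d_model))
    pe[:, 0::2] = np.sin(angles)  # even dimensions: sine
    pe[:, 1::2] = np.cos(angles)  # odd dimensions: cosine
    return pe

# Add position information to the 5 toy token embeddings from before:
# embeddings = embeddings + positional_encoding(5, 4)
```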

In addition, we can see that there are internal relations among the words in the sentence (“is” is directly linked to “Luca” and not to “cake”, for example). How can we tell the encoder that those words are connected to each other? Positional encoding itself is not enough, as it highlights an order, not a logical relation. Here comes the key mechanism that sets transformers apart: attention.

In the attention process, each word (through its query) is compared to all the words (through their keys), and the resulting scores decide how much of each word’s value flows into the output, helping the encoder decide what to focus on.

This works like the database querying system of a library: you search for some keywords that refer to the book you want to read, the query manager matches your request against the keys according to which the database is organized, and returns the books most similar to your query. In this sense, the relation of each word with every other one is evaluated: the higher the attention output, the stronger the relation between those words.
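
Here is a minimal NumPy sketch of scaled dot-product attention, the formula at the core of the original paper: every query is compared with every key, the scores are softmax-normalized, and the resulting weights blend the values. The random projection matrices stand in for the learned weights of a real model.

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """softmax(Q K^T / sqrt(d_k)) V: the scores weight the values."""
    d_k = K.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)                   # (n_queries, n_keys)
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights = weights / weights.sum(axis=-1, keepdims=True)  # row softmax
    return weights @ V, weights

# Self-attention: queries, keys and values are all projections of the
# same token embeddings (toy random projections here).
rng = np.random.default_rng(1)
X = rng.normal(size=(5, 4))                # 5 tokens, d_model = 4
Wq, Wk, Wv = (rng.normal(size=(4, 4)) for _ in range(3))
output, weights = scaled_dot_product_attention(X @ Wq, X @ Wk, X @ Wv)
print(weights.round(2))  # row i: how much token i attends to each token
```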

Feed forward neural network

Now that we have our “attentioned” word embeddings, we pass them through a feed-forward neural network, which is generally a simple Multi-Layer Perceptron.

This helps extract more features from the embeddings, but there is no word-specific mechanism: each word embedding is processed independently along the same path.
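
A sketch of this position-wise block, with toy dimensions (the original paper uses a model size of 512 and an inner size of 2048, with a ReLU in between):

```python
import numpy as np

def feed_forward(x, W1, b1, W2, b2):
    """The same two-layer MLP is applied to every token independently."""
    hidden = np.maximum(0, x @ W1 + b1)   # ReLU non-linearity
    return hidden @ W2 + b2

rng = np.random.default_rng(2)
d_model, d_ff = 4, 16                     # toy sizes for illustration
W1, b1 = rng.normal(size=(d_model, d_ff)), np.zeros(d_ff)
W2, b2 = rng.normal(size=(d_ff, d_model)), np.zeros(d_model)
# out = feed_forward(attention_output, W1, b1, W2, b2)
```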

Decoder

The decoder’s task is to turn word embeddings into probability scores that will be used by our model to:

  • Predict the next word/sequence
  • Translate a sentence
  • Recognize and caption image portions
  • Generate language or images
  • And many other things…

The decoder, starting from the encoder outputs, goes through similar steps, adding another attention layer, a feed-forward neural network and a final linear layer with softmax normalization (which means that each value in the final output vector is turned into a probability score, and all the probability scores add up to 1).
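
As an illustration of that final normalization step, here is softmax in a few lines of NumPy; the tiny vocabulary and the scores are invented purely to mirror the translation example below.

```python
import numpy as np

def softmax(logits):
    """Turn raw scores into probabilities that add up to 1."""
    exp = np.exp(logits - logits.max())   # subtract max for stability
    return exp / exp.sum()

vocab = ["is", "stays", "cake", "pastry"]
logits = np.array([2.1, 0.9, -0.5, -3.0])  # made-up decoder scores
print(dict(zip(vocab, softmax(logits).round(2))))  # values sum to 1
```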

Breakdown of decoder output

For our starting example, let’s say we end up with a matrix of probabilities that maps the original words to their probable English translations. Say “sta” has a 77% probability of being “is” and a 23% probability of being “stays”: it will then be translated as “is”. “Torta” has a 98% probability of being “cake” and a 2% probability of being “pastry”, so it will be translated as “cake”. The same goes for every word, and the final output will be: “Luca is eating the cake”.

Decoder-only and encoder-only models

Decoder-only (GPT)

We described an encoder-decoder model, but the most famous of all Large Language Models, GPT, is based on a decoder-only architecture.

In a decoder-only model, the word input is turned into embeddings directly by the decoder and, to compensate for the absence of an encoder, the first attention step “masks” words in the sentence: each position is purposely prevented from attending to the words that come after it, so that, when predicting the next token, the model can only rely on what it has already seen. Other attention steps can follow, but the decoder then proceeds as we already saw.
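
A minimal sketch of this causal masking: before the softmax, every score that would let a position look at a future token is set to minus infinity, so its attention weight becomes exactly zero.

```python
import numpy as np

def causal_attention_weights(scores):
    """Mask future positions, then softmax: position i can only
    attend to positions 0..i, never to tokens that come later."""
    n = scores.shape[0]
    future = np.triu(np.ones((n, n), dtype=bool), k=1)  # above diagonal
    scores = np.where(future, -np.inf, scores)          # hide the future
    exp = np.exp(scores - scores.max(axis=-1, keepdims=True))
    return exp / exp.sum(axis=-1, keepdims=True)

scores = np.arange(16, dtype=float).reshape(4, 4)
print(causal_attention_weights(scores).round(2))  # upper triangle is 0
```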

Encoder-only (BERT)

Another really famous model is BERT (Bidirectional Encoder Representations from Transformers): it is encoder-only, and its purpose is not to generate text (for which it would need the probability scores that come out of a decoder), but to understand relations among words and texts (it is used for sequence, token and text classification, for example).

Conclusion

In this post, we’ve embarked on a journey to demystify the concept of transformers, a revolutionary technique in the field of Natural Language Processing. We’ve broken down the transformer architecture into its constituent parts, exploring the encoder and decoder components, and delved into the key mechanisms that make transformers tick, such as tokenization, vectorization, attention, and normalization.

By understanding how transformers work, we can appreciate the power and flexibility of models like GPT and BERT, which have achieved state-of-the-art results in a wide range of NLP tasks. Whether you’re a seasoned developer or just starting to explore the world of AI, the concepts and techniques discussed in this post will provide a solid foundation for further learning and exploration.

So, the next time you hear someone mention “transformers” or “GPT,” you’ll know that it’s not just a buzzword - it’s a powerful technology that’s changing the face of AI and NLP.

References

For a deeper dive into the transformer architecture, I recommend checking out the following resources:

  • Prashant Ram’s article on Medium, “Understanding the Transformer Architecture in AI Models” [1], which provides a concise introduction to the transformer architecture.
  • The Wikipedia article on the Transformer (deep learning architecture) [2], which offers a comprehensive overview of the transformer model.
  • Built In’s article on “Transformer Neural Network” [3], which provides a detailed yet brief explanation of the transformer.
  • Towards Data Science’s article on “How Transformers Work” [4], which explores the transformer architecture and its role in natural language processing.
  • Jean Nyandwi’s post on “The Transformer Blueprint: A Holistic Guide to the Transformer Neural Network Architecture” [5], which offers a detailed breakdown of the transformer architecture and its components.

These resources provide a wealth of information on the transformer architecture and its applications in AI and NLP.

[1] Prashant Ram on Medium: “Understanding the Transformer Architecture in AI Models”. https://medium.com/@prashantramnyc/understanding-the-transformer-architecture-in-ai-models-e9f937e79df2

[2] Wikipedia: “Transformer (deep learning architecture)”. https://en.wikipedia.org/wiki/Transformer_(deep_learning_architecture)

[3] Utkarsh Ankit on Built In: “Transformer Neural Network”. https://builtin.com/artificial-intelligence/transformer-neural-network

[4] Giuliano Giancaglia on Towards Data Science: “How Transformers work”. https://towardsdatascience.com/transformers-141e32e69591

[5] Jean Nyandwi on AI research blog: “The Transformer Blueprint: A Holistic Guide to the Transformer Neural Network Architecture”. https://deeprevision.github.io/posts/001-transformer/
