Converse effortlessly in more than 50 languages!
View the Project on GitHub AstraBert/bloom-multilingual-chatbot
This logo was generated with CoderLogon, a Coze bot that generates logos for your GitHub repos, exploiting Pollinations AI API
…It does not yield the same high performance that you can get by querying it in English.
For non-native speakers this can represent an initial barrier, for two reasons:🚧
When English is not your first language, writing on-point questions that fully express what you mean can be hard, and it is not unusual for ChatGPT or other language models to get confused about what you are asking for, at least in their first answers.🤔
On the other hand, when trying to speak with the LLM in your native language (especially if it is not well represented in the content available on the World Wide Web), you can bump into awkward phrasing, errors or difficulties in interpreting idioms and other everyday expressions.🤨
It would be great if we could generate a multilingual LLM from scratch, and Bigscience, for instance, is doing a lot in this direction with Bloom🌸.
Nevertheless, we can also build upon existing English-based models, without fine-tuning or retraining them, through a simple workaround: a filtering function translates the user’s native-language query into English, feeds it to the LLM and retrieves the response, which is finally back-translated from English into the original language.㊗️
Curious to try it? Let’s use some Python to build it!🐍
To build a multilingual chatbot, you’ll need several dependencies, which you can install via pip:
python3 -m pip install transformers==4.39.3 \
langdetect==1.0.9 \
deep-translator==1.11.4 \
torch==2.1.2 \
gradio==4.28.3
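To double-check that the pinned versions actually ended up in the active environment, a small standard-library sketch like the following may help. The helper name installed_versions and the REQUIRED list are hypothetical and not part of the project's code.

```python
from importlib.metadata import PackageNotFoundError, version

# Distribution names as passed to pip above (hypothetical constant name).
REQUIRED = ["transformers", "langdetect", "deep-translator", "torch", "gradio"]


def installed_versions(packages):
    """Map each distribution name to its installed version, or None if missing."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


# Print a short report for the packages the chatbot relies on.
for name, ver in installed_versions(REQUIRED).items():
    print(f"{name}: {ver or 'NOT INSTALLED'}")
```

Any line reporting NOT INSTALLED, or a version different from the pinned one, suggests that pip installed into another interpreter or virtual environment.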
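Let’s see what these packages do:
transformers: loads the Bloom model and its tokenizer, and provides the text-generation pipeline
langdetect: detects the language of the user’s message
deep-translator: translates text through Google Translate
torch: the deep learning framework on which transformers runs the model
gradio: builds the chat interface and serves it in the browser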
We need to build a back-end architecture that looks like this (realized with Drawio):
Let’s define a Translation class that detects the original language and translates the text:
from langdetect import detect
from deep_translator import GoogleTranslator


class Translation:
    def __init__(self, text, destination):
        self.text = text  # text to translate
        self.destination = destination  # target language code, e.g. "en"
        try:
            self.original = detect(self.text)  # detect the original language
        except Exception:
            self.original = "auto"  # if detection fails, let the translator guess

    def translatef(self):
        # use Google Translate, one of the fastest translators available
        translator = GoogleTranslator(source=self.original, target=self.destination)
        return translator.translate(self.text)
As you can see, the class takes as arguments the text we want to translate (text) and the language we want to translate it into (destination).
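One thing to watch out for: langdetect reports Chinese as zh-cn or zh-tw, while deep-translator's Google backend, as far as we can tell, expects zh-CN and zh-TW. Below is a minimal sketch of a normalization step under that assumption. The table LANGDETECT_TO_GOOGLE and the helper to_google_code are hypothetical names, and the mapping should be checked against GoogleTranslator().get_supported_languages(as_dict=True) for the installed version.

```python
# Assumed mismatches between langdetect output and the codes accepted by
# deep-translator's GoogleTranslator; verify them for your installed version.
LANGDETECT_TO_GOOGLE = {
    "zh-cn": "zh-CN",
    "zh-tw": "zh-TW",
}


def to_google_code(detected):
    """Return a translator-friendly code for a langdetect result.

    Codes that are not in the table (including the "auto" fallback)
    are passed through unchanged.
    """
    return LANGDETECT_TO_GOOGLE.get(detected, detected)


print(to_google_code("zh-cn"))  # zh-CN
print(to_google_code("it"))     # it
```

The normalized code could then be used as the source (or destination) language instead of the raw detection result.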
Let’s now load the LLM that we want to use: we’ll start with Bigscience’s Bloom-1.7B, a medium-sized LLM that is a good match for hardware with 16GB of RAM and a 2-core CPU.
from transformers import AutoModelForCausalLM, AutoTokenizer, pipeline

model = AutoModelForCausalLM.from_pretrained("bigscience/bloom-1b7")  # load the model
tokenizer = AutoTokenizer.from_pretrained("bigscience/bloom-1b7")  # load the tokenizer
# prepare the inference pipeline (do_sample=True lets the temperature take effect)
pipe = pipeline("text-generation", model=model, tokenizer=tokenizer, max_new_tokens=2048, repetition_penalty=1.2, do_sample=True, temperature=0.4)
We define a maximum number of generated tokens (2048), set the repetition penalty to 1.2 (fairly high) to keep the model from repeating the same thing over and over again, and keep the temperature (the “creativity” of the response) quite low. Since the temperature only applies when sampling is enabled, we also pass do_sample=True.
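Keep in mind that, by default, a text-generation pipeline returns the prompt followed by the continuation, so the bot would echo the question back before answering. Passing return_full_text=False to the pipeline call is one way to avoid this; the sketch below shows an equivalent, dependency-free helper. The name strip_prompt and the sample output are hypothetical, made up only to illustrate the shape of the result.

```python
def strip_prompt(prompt, generated_text):
    """Drop the echoed prompt from the start of a generated string, if present."""
    if generated_text.startswith(prompt):
        return generated_text[len(prompt):].lstrip()
    return generated_text.strip()


# Shape of a text-generation result: a list with one dict per generated sequence.
fake_output = [{"generated_text": "What is Bloom? Bloom is a multilingual language model."}]
answer = strip_prompt("What is Bloom?", fake_output[0]["generated_text"])
print(answer)  # Bloom is a multilingual language model.
```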
Now, let’s create a function that is able to take a message from the chat, translate it to English (unless it is already in English), feed it as a prompt to Bloom, retrieve the English response and back-translate it into the original language:
def reply(message, history):
    txt = Translation(message, "en")
    if txt.original == "en":
        # already in English: no translation needed
        response = pipe(message)
        return response[0]["generated_text"]
    else:
        translation = txt.translatef()  # user's language -> English
        response = pipe(translation)
        # back-translate the English response into the user's language
        t = Translation(response[0]["generated_text"], txt.original)
        res = t.translatef()
        return res
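To try out the translate, generate, back-translate flow without downloading Bloom or calling Google Translate, the same logic can be sketched with injected callables. Everything here (make_reply and the toy_* stand-ins) is hypothetical and exists only for illustration; unlike the function above, this sketch passes "en" explicitly as the source language of the back-translation instead of detecting it again.

```python
def make_reply(detect_lang, translate, generate):
    """Build a chat handler from three injected callables.

    detect_lang(text) -> language code
    translate(text, source, target) -> translated text
    generate(prompt) -> English completion
    """
    def reply(message, history=None):
        lang = detect_lang(message)
        if lang == "en":
            return generate(message)  # no translation needed
        english_prompt = translate(message, lang, "en")
        english_answer = generate(english_prompt)
        return translate(english_answer, "en", lang)  # back to the user's language
    return reply


# Toy stand-ins so the flow can be exercised offline.
TOY_DICTIONARY = {("ciao", "it", "en"): "hello", ("HELLO", "en", "it"): "CIAO"}


def toy_detect(text):
    return "it" if text == "ciao" else "en"


def toy_translate(text, source, target):
    return TOY_DICTIONARY[(text, source, target)]


def toy_generate(prompt):
    return prompt.upper()  # pretend the "model" just shouts the prompt back


reply_offline = make_reply(toy_detect, toy_translate, toy_generate)
print(reply_offline("ciao"))   # CIAO
print(reply_offline("hello"))  # HELLO
```

Swapping the toy callables for langdetect, deep-translator and the Bloom pipeline would give back the behaviour of the real chatbot, while keeping the control flow testable on its own.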
We now have all we need for our back-end architecture: it is time to build the front-end interface!
With Gradio, building the user interface takes a single line of code (plus the import):
import gradio as gr

demo = gr.ChatInterface(fn=reply, title="Multilingual-Bloom Bot")
Now we can launch the application with:
demo.launch()
And, assuming that we saved the whole script in a file named chat.py, to run the chatbot we go to our terminal and type:
python3 chat.py
Then we patiently wait and head over to the local server link that Gradio will give us once everything is loaded and ready to work!
If you want to find the source code, go to the scripts folder.
Do you want to try what we just created? Make sure to visit this Hugging Face Space I built: as-cle-bert/bloom-multilingual-chat💻.
bloom-multilingual-chatbot on your machine
bloom-multilingual-chatbot is also available as a Docker image:
docker pull ghcr.io/astrabert/bloom-multilingual-chatbot:latest
You can then make it run with the following command:
docker run -p 7860:7860 ghcr.io/astrabert/bloom-multilingual-chatbot:latest
IMPORTANT NOTE: when launched with docker run, the app does not log the port on which it is running until you press Ctrl+C, but at that point it also interrupts the execution! The app will be served at 0.0.0.0:7860 (or localhost:7860 if you are on Windows), so just make sure to open your browser at that address and to refresh it after 1 to 5 minutes (depending on your computer and network capacity), when the model and the tokenizer should be loaded and the app should be ready to work!
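Instead of refreshing the browser by hand, one could poll the address until the app answers. The following is a minimal standard-library sketch, assuming the app is published on localhost:7860 as in the command above; the function name wait_until_ready and its default timeout and interval are hypothetical. It checks for an actual HTTP 200 response and not just an open port, since Docker may accept connections on the published port before the app inside the container is ready.

```python
import time
import urllib.request
from http.client import HTTPException


def wait_until_ready(url="http://localhost:7860", timeout=300.0, interval=5.0):
    """Poll url until it answers with HTTP 200; return False if timeout elapses."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            with urllib.request.urlopen(url, timeout=interval) as response:
                if response.status == 200:
                    return True
        except (OSError, HTTPException):
            pass  # nothing is answering yet (refused, reset, timed out, HTTP error)
        time.sleep(interval)
    return False
```

Calling wait_until_ready() right after docker run (from another terminal) returns True as soon as the interface is reachable, or False after five minutes with the defaults chosen here.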
Another fundamental caveat is that we are dealing here with a relatively small language model (approx. 3GB), so it is CPU-friendly (you can run it without a GPU): indeed, 8GB of RAM and a 12-core CPU can be enough to make the Docker container work, but language generation will be really slow.
You will need at least 16 to 32 GB of RAM and/or a GPU to speed up the model.
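These figures can be sanity-checked with some back-of-the-envelope arithmetic: the size of the weights is roughly the number of parameters times the bytes used per parameter. The sketch below assumes about 1.7 billion parameters (rounded from the model name) and ignores activations, tokenizer and framework overhead; the names BYTES_PER_PARAM and weights_size_gb are hypothetical.

```python
# Bytes needed to store one parameter in each numeric format.
BYTES_PER_PARAM = {"float32": 4, "float16": 2, "int8": 1}


def weights_size_gb(num_params, dtype):
    """Approximate size of the model weights alone, in gigabytes (10**9 bytes)."""
    return num_params * BYTES_PER_PARAM[dtype] / 1e9


NUM_PARAMS = 1.7e9  # assumption: rounded from the "1b7" in the model name

for dtype in BYTES_PER_PARAM:
    print(f"{dtype}: {weights_size_gb(NUM_PARAMS, dtype):.1f} GB")
# float32: 6.8 GB
# float16: 3.4 GB
# int8: 1.7 GB
```

The half-precision estimate is consistent with the approx. 3GB mentioned above, while loading the same weights in full precision (which, to our knowledge, is what from_pretrained does when no torch_dtype is given) would take roughly twice as much, which may explain why 8GB of RAM is a tight fit.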
If you like the idea, make sure to show your support by leaving a little ⭐ on GitHub!
If you wish, you can also support my open-source work by funding me on GitHub: this will allow me to improve my multilingual chatbot’s performance by hosting it on more powerful hardware on HF.