Masakhane Web

  • Masakhane Web is an open-source online machine translation service solely for African languages. The project is in line with the work of the Masakhane community. Masakhane, meaning ‘we build together’, is an open-source, online research effort whose mission is to strengthen and spur NLP research for African languages. So far, the community has trained translation models for over 38 African languages. The platform aims to host the machine translation models already trained by the Masakhane community and to let users contribute new data for retraining and improving the models.

  • You can find the deployed project here.

Message Screener

  • The aim of the project is to create a tool that analyses a user’s message before it is sent. By flagging words, tone and topics that might be considered sensitive, it helps social media users be more mindful of their posts and avoid unnecessary controversy. The project contains the following product features:
    • Profanity screener (with blacklist)
    • Sentiment analyser
    • Topic Identifier
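
As a rough illustration of the profanity screener idea, the sketch below checks a message against a word list and returns the words that would be flagged. The word list, the function name and the tokenisation rule are placeholders made up for this example, not taken from the project itself.

```python
import re

# Hypothetical word list for illustration; a real screener would load a curated list.
BLOCKLIST = {"darn", "heck"}


def flag_blocked_words(message, blocklist=BLOCKLIST):
    """Return the blocklisted words found in the message, in order of first appearance."""
    # Lowercase the message and split it into simple alphabetic tokens.
    tokens = re.findall(r"[a-z']+", message.lower())
    flagged = []
    for token in tokens:
        if token in blocklist and token not in flagged:
            flagged.append(token)
    return flagged
```

Calling `flag_blocked_words(draft)` before a post is sent would give the user a list of words to reconsider. Exact word matching like this misses misspellings and context, which is where the sentiment and topic features would come in.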

Rock Classifier

  • The aim of this project was to train a machine vision classifier that identifies five material classes for a large mining company in South Africa, allowing automatic monitoring of the input bins for their furnace. The image sampling system was already in place, and raw, unclassified image data was available. The goal is to help the organization reduce material handling errors, which lead to inefficiencies in the furnace and drive up production costs.

Twitter Sentiment Analysis using Streamlit

  • This is a work-in-progress sentiment analysis app built using Streamlit. Its purpose is to get the sentiment of tweets from Twitter.
  • Future improvements:
    • getting average sentiment of tweets from a page or specific hashtags
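
The planned averaging step could look something like the sketch below, which scores each tweet and takes the mean. The scoring function here is a toy word-count rule with made-up word lists, standing in for whatever model the app actually uses; all names are assumptions for this example.

```python
from statistics import mean

# Toy word lists, made up for this sketch only.
POSITIVE = {"good", "great", "love"}
NEGATIVE = {"bad", "awful", "hate"}


def toy_score(text):
    """Score text as (positive words - negative words) / total words, in [-1, 1]."""
    words = text.lower().split()
    if not words:
        return 0.0
    pos = sum(w in POSITIVE for w in words)
    neg = sum(w in NEGATIVE for w in words)
    return (pos - neg) / len(words)


def average_sentiment(tweets, score_fn=toy_score):
    """Return the mean score over a collection of tweets, or None if it is empty."""
    scores = [score_fn(t) for t in tweets]
    return mean(scores) if scores else None
```

Passing the scoring function as a parameter means the same averaging code would work unchanged if the toy rule were swapped for a trained sentiment model.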

Swahili speech dataset: A Swahili speech dataset for low-resource speech tasks.

  • Approximately 2 hours of audio recordings from a native Swahili speaker.
  • Can be used for low-resource speech tasks

Predicting Female-Headed Households in South Africa

  • The objective of this challenge was to build a predictive model that accurately estimates the percentage of households per ward that are female-headed and living below a particular income threshold.
  • You can find the notebooks used here.