With the rapid growth of biological research, large and rich knowledge bases are now available online for analysis, hypothesis generation and referencing. However, the information in these vast databases is scattered and needs to be ‘Biocurated’, i.e., analyzed, categorized and assigned to relevant topics through manually generated or automated Knowledge Graphs (KGs). A KG inherits the properties of a graph database, such as a flexible schema, ease of writing complex queries and good query performance. Many Life Sciences projects and enterprises today hire skilled Biocurators who deliver curated data without compromising on its quality.
MANAV – The Human Atlas Initiative is one such project, in which macro- and micro-level information from a large knowledge base of life sciences literature and databases is being curated. You can join the MANAV initiative to gain hands-on Biocuration experience under the guidance of domain experts.
Interested in knowing more? Read this blog to understand how to get industry-ready for a ‘Biocurator’ profile with the help of the ‘MANAV Upskilling Flow’ (the process of student training on the MANAV Platform).
Understanding Biocuration
Biocuration is a process in life sciences dedicated to the collation and organization of data, information and knowledge in structured formats like Text, Excel sheets, Databases and KGs. This curated data is then used for different activities like strategizing a new research plan, drawing inferences or drafting a research proposal.
The key role of a Biocurator is to collect and sort data from primary sources like research articles, publications, whitepapers, technical papers/notes and other related materials. Biocurators are also responsible for maintaining data quality, which includes removing duplicate entries, replacing outdated information with updated information, and removing or tagging unreferenced information and evidence. If you are a college or university student in the science stream, MANAV is an opportunity for you to learn a new skillset that can help you become a professional Biocurator!
Data Curation Process
The workflow for Biocuration is as follows:
Identifying the research problem and related papers: First finalize the research problem, then search for the relevant articles.
Identifying a software system for data extraction: Choosing the right tools is important. Tools that let you configure and integrate rules and values are preferred.
Devising a process for data validation and quality checks: Data quality can be measured manually or through Machine Learning (ML) techniques. One can build ML models to check data accuracy.
Uploading data to a database for further querying and processing: Data can be stored in the form of Knowledge Graphs for future reference.
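The last two steps above can be sketched in a few lines of Python. This is only a minimal illustration and not MANAV's actual data model: the `Triple` class, the `clean` helper, the relation names and the `article-1`/`article-2` source IDs are all hypothetical placeholders. It stores each curated fact as a subject-relation-object triple, drops duplicate entries and sets aside facts that have no source reference.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Triple:
    """One curated fact: subject -[relation]-> object, plus its source."""
    subject: str
    relation: str
    obj: str
    source: Optional[str] = None  # placeholder article ID; None = unreferenced


def clean(triples):
    """Drop duplicate facts and separate out the ones that lack a source."""
    seen = set()
    referenced, unreferenced = [], []
    for t in triples:
        # Compare facts case-insensitively so "TP53" and "tp53" match.
        key = (t.subject.lower(), t.relation.lower(), t.obj.lower())
        if key in seen:
            continue  # duplicate entry: keep only the first occurrence
        seen.add(key)
        (referenced if t.source else unreferenced).append(t)
    return referenced, unreferenced


# Illustrative input only; the source IDs are made-up placeholders.
curated = [
    Triple("TP53", "regulates", "CDKN1A", source="article-1"),
    Triple("tp53", "regulates", "cdkn1a", source="article-2"),  # duplicate
    Triple("insulin", "binds", "INSR"),                         # no source
]

referenced, unreferenced = clean(curated)
print(len(referenced), len(unreferenced))
```

In a real pipeline the unreferenced list would go back to a curator for tagging or removal, and the referenced triples would be loaded into a graph database rather than kept in memory.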
Skilled Biocurators – The Need of the Hour
Collating and annotating large life sciences datasets requires individuals with specialized skillsets and proficiencies. The curation process includes complex functions like data collection, intelligent interpretation, annotation and structured storage of biomedical information, all of which call for skilled Biocurators.
Challenges Faced by Students for Becoming Biocurators
- The current growth in biological knowledge bases demands trained Biocurators to accelerate hybrid annotation (both automated and manual). Although there is industry demand for Biocurators, a clear skill gap exists because of a lack of training and experience.
- In India, we don’t have a formal degree for Biocuration. Only a few academic institutions offer a certification in ‘Biocuration’. ELIXIR (the European Life-sciences Infrastructure for Biological Information) and GOBLET (Global Organization for Bioinformatics Learning, Education and Training) provide training and support for Biocuration. The University of Cambridge and EMBL-EBI also offer a Postgraduate Certificate in Biocuration.
- A major challenge lies in putting a process in place that helps users complete their journey from Science Graduate to Biocurator.
- There is a dire need for a system that offers predefined and customized courses at beginner as well as intermediate levels. This system should be designed so that it can assess a student’s improvement in Biocuration over a period of time.
Become a Biocurator through MANAV Platform & Upskill Yourself
The MANAV platform helps students to update their scientific knowledge as well as train themselves on various operating tools. Follow these 7 easy steps to become a Biocurator for the MANAV Platform & upskill yourself:
#Step 1 – To help students practice better, the MANAV platform has already tagged a few research articles for training purposes. To access these resources, students can register on the MANAV platform and start curating data.
#Step 2 – After keying in your complete profile details, you can join a team of Biocurators.
#Step 3 – Once you join, a team owner will be assigned to you. The team owner is responsible for overall team management and for important tasks like assigning research articles to students, allotting reviewers for the curated content and helping curators with any subject-related queries.
#Step 4 – To get the curation started, a student first goes to the LMS, a learning module that offers insights on articles and best research practices. After completing the LMS module, students proceed to article annotation.
#Step 5 – MANAV has a built-in Knowledge Graph editor which allows students to read, mark and annotate important information from research articles. During this process, the students study the assigned research articles and build manual Knowledge Graphs from them.
#Step 6 – To ensure the quality and efficiency of the Knowledge Graphs, student-created KGs are then compared with existing Gold Standard KGs, which are curated by domain experts.
#Step 7 – In the final step, students can see their KG comparison scores. They can also check their KGs against the KGs created by experts. A student can receive a commendation from an expert for their work.
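How MANAV computes the comparison score in Steps 6 and 7 is not described here. One common way to compare a curated KG against a gold standard is to treat both as sets of triples and compute precision, recall and F1. The sketch below assumes that approach with exact matching only; the `compare_kgs` function and the sample triples are hypothetical and purely illustrative.

```python
def compare_kgs(student, gold):
    """Score a student KG against a gold-standard KG.

    Both arguments are collections of (subject, relation, object) tuples.
    Only exact matches count as correct.
    """
    student, gold = set(student), set(gold)
    matched = student & gold
    precision = len(matched) / len(student) if student else 0.0
    recall = len(matched) / len(gold) if gold else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {
        "precision": precision,
        "recall": recall,
        "f1": f1,
        "missed": gold - student,  # facts the student did not capture
        "extra": student - gold,   # facts absent from the gold standard
    }


# Illustrative KGs only.
gold_kg = {
    ("TP53", "regulates", "CDKN1A"),
    ("insulin", "binds", "INSR"),
    ("INSR", "expressed_in", "liver"),
}
student_kg = {
    ("TP53", "regulates", "CDKN1A"),
    ("insulin", "binds", "INSR"),
    ("insulin", "expressed_in", "liver"),  # deliberately wrong subject
}

scores = compare_kgs(student_kg, gold_kg)
print(round(scores["f1"], 2))
```

The `missed` and `extra` sets are what make the side-by-side check against the expert KG useful: they show which facts were overlooked and which were annotated incorrectly, not just a number.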
The MANAV platform also provides a robust query interface for keyword-based search on its database. Students can easily run a query on a topic of interest and view the results.
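To give a feel for what a keyword-based search over curated triples involves, here is a minimal in-memory sketch. It does not reflect MANAV's query interface, which is not documented here; the `keyword_search` function and the sample data are hypothetical.

```python
def keyword_search(triples, keyword):
    """Return every triple in which any field contains the keyword
    (case-insensitive substring match)."""
    needle = keyword.lower()
    return [t for t in triples if any(needle in field.lower() for field in t)]


# Illustrative triples only.
kg = [
    ("TP53", "regulates", "CDKN1A"),
    ("insulin", "binds", "INSR"),
    ("INSR", "expressed_in", "liver"),
]

hits = keyword_search(kg, "insr")
print(hits)
```

A production system would typically push this search down to the graph database or a text index instead of scanning a list, but the matching idea is the same.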
Gain Competitive Advantage with the MANAV Upskilling Flow
After completing the journey with the ‘MANAV Upskilling Flow’, you will gain deeper insights into:
- Best practices for reading and comprehending research papers
- Hands-on knowledge of the annotation tool
- Data interpretation and representation in the form of Knowledge Graphs
- Dynamic query interface that helps you build a complex query for any scientific problem
Upon successful completion, students will receive a certificate from MANAV. The MANAV platform is open source and free for all, and it can also be deployed in a custom environment. MANAV is not limited to an individual’s improvement: one can also build a team to tackle and work on bigger scientific problems.
As the technology partner for MANAV, we have expertise in Knowledge Graph-based solutions for the Health Care & Life Sciences industry. We design and develop customized KG curation and evaluation platforms and methodologies. Our data mining tool, built in-house, can help train your resources, collate relevant information with context and extract interesting use cases from large databases. If this interests you, our team will be happy to give you a live demo. To learn more about Persistent’s comprehensive Life Sciences Innovation & Engineering solutions, connect with us.