Bioinformatics services on the cloud

Aug 08, 2014 Aathira Nair

Personalized medicine aims to treat each patient with the medicine best suited to their body, as determined by their genetic makeup. Bioinformatics is an interdisciplinary field that takes experimental data from the biological sciences and analyzes it with information technology tools such as cloud servers and analytical software. Publicly available genomic data, low-cost high-throughput molecular technologies for profiling patient populations, and computational and informatics tools are becoming vital considerations in genomic medicine.

Cloud computing has been most influential in high-throughput sequence data analysis. With the volume of data multiplying every year, maintaining and processing data for such analyses is a daunting task for small and large laboratories alike. Hadoop has been used successfully in bioinformatics because it meets the essential needs of biological data analysis. Hadoop consists of two parts: MapReduce, which processes data in parallel, and the Hadoop Distributed File System (HDFS), which stores data across the machines of a cluster. Using these two parts, Hadoop can solve large data problems while using technology infrastructure more efficiently. Cloud-based analysis compares favorably with local computational clusters in both performance and cost, suggesting that cloud computing technologies might be a viable option for facilitating large-scale translational research in genomic medicine.
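To make the MapReduce idea concrete, here is a minimal sketch in plain Python that counts k-mers (short subsequences) across sequencing reads. It does not use Hadoop itself; it only imitates the map, shuffle and reduce phases on one machine. The function names and the sample reads are made up for illustration.

```python
from collections import defaultdict
from itertools import chain


def map_read(read, k):
    """Map step: emit a (k-mer, 1) pair for every k-mer in one read."""
    return [(read[i:i + k], 1) for i in range(len(read) - k + 1)]


def shuffle(pairs):
    """Shuffle step: group emitted values by key, as a framework would."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_counts(key, values):
    """Reduce step: sum the counts collected for one k-mer."""
    return key, sum(values)


def count_kmers(reads, k):
    """Run map, shuffle and reduce in sequence on a list of reads."""
    mapped = chain.from_iterable(map_read(read, k) for read in reads)
    grouped = shuffle(mapped)
    return dict(reduce_counts(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    sample_reads = ["ACGTAC", "CGTACG"]  # hypothetical reads, not real data
    print(count_kmers(sample_reads, 3))
```

In a real Hadoop job the reads would sit in HDFS, and the map and reduce steps would run on many machines at once. The logic of each step would stay much the same.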

Traditionally, bioinformatics meant downloading databases and software and then analyzing the data at hand locally with the installed software. On the cloud, usage can take different forms depending on the needs of the task.

Software as a Service – Bioinformatics relies on a variety of software, often highly specific in its usage, for different types of data analysis. SaaS provides access to this software from remote locations without the hassle of IT intervention and setup requirements.

Platform as a Service – The environment delivered through PaaS includes the programming languages, web servers and databases needed to develop cloud-based bioinformatics applications for data sequencing and analysis. Many databases available for use resemble PaaS, except that, unlike PaaS, they do not allow any application development.

Data as a Service – Knowledge discovery and downstream analysis require access to up-to-date sequence data from across laboratories. Cloud storage with on-demand access suits these requirements well, and there have also been attempts to scale database usage with sequence mapping and alignment services on the cloud.

Amazon Web Services (AWS) hosts many public data sets that can be integrated seamlessly into any cloud application and used as and when required through a subscription model.

Infrastructure as a Service – To realize the true potential of the cloud and to collaborate and work seamlessly, IaaS provides virtualized resources that can be used as and when required. Given the rapid advancements in IT infrastructure, it is more sensible to work in virtual machines (VMs), which can be tuned to the exact resource requirement without wasting time and resources on IT.

Academic Cloud

Today only a small fraction of the large volume of biological data generated is stored in the cloud, and most existing sequencing and other bioinformatics tools were developed for desktops, not for the cloud. In addition, the difficulty of uploading large volumes of biological data to the cloud keeps scientists from exploring the cloud computing route. High-speed data transfer, peer-to-peer data distribution and similar options need to be evaluated to find the best way to bring IaaS and the cloud to the biological data sciences.
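A quick back-of-the-envelope calculation shows why uploads are a real obstacle. The sketch below estimates upload time from data size and link speed. The function name, the decimal units (1 GB = 1000 MB) and the 80% efficiency factor are assumptions for illustration, not measured figures.

```python
def transfer_hours(size_gb, link_mbps, efficiency=0.8):
    """Estimate the hours needed to upload size_gb gigabytes.

    link_mbps is the nominal link speed in megabits per second.
    efficiency is an assumed fraction of that speed actually achieved.
    """
    if size_gb < 0 or link_mbps <= 0 or not 0 < efficiency <= 1:
        raise ValueError("size must be >= 0, speed > 0, efficiency in (0, 1]")
    size_megabits = size_gb * 1000 * 8  # decimal gigabytes to megabits
    seconds = size_megabits / (link_mbps * efficiency)
    return seconds / 3600


if __name__ == "__main__":
    # Hypothetical case: 1 TB of sequencing data over a 100 Mbps link.
    print(f"{transfer_hours(1000, 100):.1f} hours")
```

Under these assumptions, a single terabyte takes more than a day to upload over a 100 Mbps connection, which is why faster transfer methods are worth evaluating.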

Creating a cloud space publicly available to academia and the research community will aid in building a strong value proposition for moving bioinformatics to the cloud. The potential resulting benefits of such bioinformatics clouds include easing large-scale data integration, enabling repeatable and reproducible analyses, maximizing the scope for sharing, and harnessing collective intelligence for knowledge discovery.

Nalashaa is experienced in working with healthcare providers and understands the sensitivity of the data involved. We offer services that meet the highest standards of security, assurance and integrity. Our data center services comprise data center design, security audits, load balancing and virtualization.

Know more about our IaaS offering

Aathira Nair

An engineer by education, foraying into a medley of activities - content, social media and marketing.