Category Archives: Bioinformatics
The experiment with integrating SAP HANA into teaching and research here at Missouri S&T is paying off. Last week I observed our Business and Information Technology (BIT) students presenting their ERP Simulation projects to a team from the Deloitte SAP Service Line. What caught my eye was that the students are now incorporating data from our Autism gene mapping research project, a university research project that my IT DBA staff are collaborating on in order to learn how to better support SAP HANA. This goes back to my original strategic decision to invest in SAP HANA to allow our researchers and students to align more closely with the desires of our corporate employers. See my blog post from last year. I elaborated on the concept of IT’s changing role as a facilitator of teaching and research in this article published in “CIO Review” last fall. Observing our students’ understanding of HANA’s potential for Business Intelligence support in their projects is justification enough for the investment. But the excitement is now being generated by how HANA fits into our overall STEM teaching and research environment.
The Autism project was a fortunate opportunity to learn and explore the potential of HANA. Feedback from my DBAs about how HANA differs from their traditional relational database experience is encouraging as well. What I hear is that HANA is initially daunting in its complexity. However, it makes the initial database layout easier because it shows you so many more possible relationships. Of course, this builds on HANA’s column-oriented architecture and its heavy use of in-memory processing. The HP SAP HANA appliance just packages it all into a more effective tool chest. Combine HANA with an already rich set of BI and visualization tools, then let talented students run with it, and you see the potential is endless.
Back to the Autism Project: the study is fascinating, especially to me with my bioinformatics background. The research investigators include Drs. Tayo Obafemi-Ajayi, Bih-Ru Lea and Donald C. Wunsch. Here is a portion of their abstract:
Several studies conducted on autism gene expression analysis suggest that autism can be linked to specific genes though there are still no genetic markers that are undeniably diagnostic for idiopathic ASD. What is known is that the genetic landscape of autism is complex, with many genes possibly contributing to the broad autism phenotype. Genetic data analysis involves big data analytics. The ASD HANA in-memory database project will facilitate the goal of the ECE researchers to develop novel computational learning models for analysis of ASD genetic data. The genotype data of these ASD patients is available through the Simons Simplex Collection (SSC).
So the research is progressing, and we expect significant new funding thanks to the proof-of-concept work already done. Chalk up a win for stimulating research. But another win is how the students have applied a portion of the data to create BI class projects. Now they see the connection to the health science industry. Because we now understand the potential of HANA, we have also validated a research connection for the petroleum industry. This was the hope for the HANA investment: a perfect storm in which STEM-savvy business students meet corporate recruiters and research ideas are identified along the way. That is a win for all. This is the type of IT support flexibility needed by the emerging higher education teaching and research model of the future.
I checked out Coursera’s course offerings and I have to admit they have a great lineup of quality courses. I signed up for “Introduction to Logic” from Stanford, which begins soon, so I could evaluate the process and quality of delivery, plus I am somewhat interested in logic. Then I signed up for “Introduction to Genome Science” from University of Pennsylvania for a fun refresher to my MS in Bioinformatics where my thesis was “Security of Our Personal Genome”. Purely continuing education, but what a huge market that could be. You do realize this is wave 2 of open courseware. Coursera’s quote: “We are changing the face of education globally, and we invite you to join us.” Let’s assume Coursera is able to competently deliver these courses to any number of students. And let’s assume their student assessment techniques allow them to validate that learning took place. They have the prestige of elite institutions of higher education. What does this mean?
What if a year from now millions of people are successfully completing courses through Coursera, Udacity and probably other copycat competitors? First, Coursera is going to be worth billions, and second, a benchmark will be established that will define what a quality online course is. What will this benchmark mean? It will eliminate the argument that legitimate for-profit online providers lack quality. But more important, it will validate the other argument that many of the online courses from traditional non-profit institutions are not worth the bandwidth you are wasting on them. So what does this mean for most of us in higher education? Our online or blended offerings, which we realize we must offer, will have to be of similar quality to the free offerings from the Courseras of the world. We will have a benchmark. And then we just worry about holding on to our control of accreditation for validating what a college degree is and what it is worth. I am thankful that we will still have the value of the campus experience, but again, what will it be worth?
Update July 17, 2012 – More research universities join Coursera
Hot topic on the news wire lately is the over-the-counter availability of “Your Personal DNA Report” from Pathway Genomics. This service involves sending in a saliva sample to a “Certified Lab” where comprehensive genotype testing will be performed regardless of your decision to order a report at additional cost. The first concern here is that the comprehensive genotyping data will be stored by Pathway Genomics. The second concern is that you may believe the stored record is anonymous.
So why am I talking about this? Only because these things dredge up all the predictions and fears that I had in writing my 2003 thesis, “Security of Your Personal Genome”, for my MS in Bioinformatics. I do think it is dangerous for us to venture down this path. I’m not sure we should tread so close to God-like knowledge. But my real fear is that if a record of your genome exists, the risk of it getting into the wrong hands is not worth the knowledge it may bring you. As for anonymous: wrong. Your genomic information could be compromised or shared under this “anonymous” label, and it is not difficult for today’s search algorithms to match characteristics from the genome with, say, protein results from a past blood test, or with a family tree that shows a certain trend toward a disease. One little link of information is enough to complete a match to you. So with the direction health care is going, I would not want an insurer or an employer to be able to make a calculated estimate about my health future. Of course I might be OK, since my grandfather lived to be 102 and my dad, who is 95, expects to beat that mark. Just be careful and really think about how you control or protect your personal genome. It is your ultimate ID.
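To make the matching concern concrete, here is a minimal sketch of how an “anonymous” genome record could be linked back to a name. Everything in it is hypothetical: the records, the field names, the `risk_variant_A` label and the `link_records` helper are invented for illustration and do not describe how Pathway Genomics or any other lab stores data. The idea is simply that traits inferable from a genome (sex, blood type, a disease-related variant) can be joined against an identified record set that carries the same traits.

```python
from collections import defaultdict

# Hypothetical "anonymous" genotype records: no names, only traits
# that could be inferred from the genome itself.
anonymous_genomes = [
    {"sample_id": "G-001", "sex": "M", "blood_type": "O+", "variant": "risk_variant_A"},
    {"sample_id": "G-002", "sex": "F", "blood_type": "A+", "variant": "none"},
    {"sample_id": "G-003", "sex": "M", "blood_type": "A+", "variant": "none"},
]

# Hypothetical identified records, e.g. from a past blood test plus a
# family history that points toward the same variant.
identified_records = [
    {"name": "Alice Example", "sex": "F", "blood_type": "A+", "family_history": "none"},
    {"name": "Bob Example", "sex": "M", "blood_type": "O+", "family_history": "risk_variant_A"},
    {"name": "Carl Example", "sex": "M", "blood_type": "A+", "family_history": "none"},
    {"name": "Dan Example", "sex": "M", "blood_type": "A+", "family_history": "none"},
]


def link_records(genomes, records):
    """Return {sample_id: name} for genomes that match exactly one person."""
    # Index the identified records by the traits both data sets share.
    by_traits = defaultdict(list)
    for rec in records:
        key = (rec["sex"], rec["blood_type"], rec["family_history"])
        by_traits[key].append(rec["name"])

    matches = {}
    for genome in genomes:
        key = (genome["sex"], genome["blood_type"], genome["variant"])
        candidates = by_traits.get(key, [])
        # A single candidate means the "anonymous" sample is re-identified.
        if len(candidates) == 1:
            matches[genome["sample_id"]] = candidates[0]
    return matches


reidentified = link_records(anonymous_genomes, identified_records)
print(reidentified)
```

In this toy data, two of the three samples resolve to a single person, while the third stays ambiguous because two people share the same traits. Each extra trait that can be read from the genome shrinks that ambiguity, which is why one little link of information can be enough.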
May 18, 2010 – Unbelievable – UC Berkeley is having incoming freshmen submit to a DNA analysis to stimulate discussion about nutrition.
May 28th – Unwinding Berkeley’s DNA Test, Inside Higher Ed article.
August 13 – The California Department of Public Health demands that Berkeley change its freshman DNA experiment. Double Helix Trouble, Inside Higher Ed article.