SemGen: Providing innovative products and services in bioinformatics


Mass spectrometry technology

Protein Reader is a technology prototype for accurate protein identification using mass spectrometry (MALDI-TOF/TOF) combined with LC/MS. With this combination of technologies we can readily identify unknown microorganism species (biotyping) and find protein mutations, and we will soon be able to determine post-translational modifications. We use a new patented molecule, CAF-/CAF+, developed at the Rudjer Boskovic Institute, to facilitate peptide fragmentation without sample degradation.



Genomics technology

Rapid advances in methods for genome screening, sequencing and analysis are constantly redefining our understanding of genome structure, composition and activity. The use of techniques such as microarrays, tiling arrays and the most recent DNA sequencers based on non-Sanger methods further accelerates this process.

These technologies have enabled new discoveries that challenge what was previously taken as established fact, such as the notion of junk DNA. The new findings indicate that portions of this junk DNA are attributable to untranslated RNAs and microRNAs that are important in cellular regulatory functions. As a consequence, established genomes are being re-sequenced and re-analysed.

The quantity and complexity of biological data are also increasing exponentially. This has led to the need for new and more efficient means of storing and analysing genetic data. Conventional flat-file and relational database structures have become impractical and increasingly obsolete.

SemGen uses a novel, proprietary approach to constructing databases, in which abstracts available through PubMed are linked to the underlying genome annotations. The end result is a "Custom Genome Database" that supports natural-language-style queries to reveal underlying gene interactions.
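SemGen's approach is proprietary and is not described here, so the following is only a generic, minimal sketch of the idea of linking abstracts to annotations. The abstract texts, the identifiers (`abs-1` and so on) and the function names `link_abstracts` and `co_mentions` are all hypothetical, and simple co-mention counting stands in for whatever method is actually used.

```python
from collections import defaultdict
from itertools import combinations

# Illustrative annotation table: gene symbol -> annotation record.
ANNOTATIONS = {
    "TP53": {"chromosome": "17"},
    "MDM2": {"chromosome": "12"},
    "BRCA1": {"chromosome": "17"},
}

# Hypothetical abstracts keyed by made-up identifiers (not real PubMed IDs).
ABSTRACTS = {
    "abs-1": "MDM2 binds TP53 and promotes its degradation.",
    "abs-2": "Loss of BRCA1 function alters TP53 signalling.",
    "abs-3": "TP53 and MDM2 form a negative feedback loop.",
}


def link_abstracts(abstracts, annotations):
    """Map each annotated gene symbol to the abstracts that mention it."""
    index = defaultdict(set)
    for abstract_id, text in abstracts.items():
        # Naive tokenisation: split on whitespace and strip punctuation.
        tokens = {token.strip(".,;()").upper() for token in text.split()}
        for symbol in annotations:
            if symbol in tokens:
                index[symbol].add(abstract_id)
    return index


def co_mentions(index):
    """Count abstracts shared by each gene pair (a crude interaction hint)."""
    counts = {}
    for gene_a, gene_b in combinations(sorted(index), 2):
        shared = index[gene_a] & index[gene_b]
        if shared:
            counts[(gene_a, gene_b)] = len(shared)
    return counts


index = link_abstracts(ABSTRACTS, ANNOTATIONS)
pairs = co_mentions(index)
print(pairs)
```

In this toy data, the pair mentioned together most often is the one a query for likely interactions would rank first. A real system would need proper entity recognition and synonym handling rather than exact token matching.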

The combination of a powerful data pipeline engine and a flexible, scientific architecture enables SemGen to uniquely address challenges faced by research and development organizations.

SemGen uses technologies which offer:

  • flexible, modular and service-oriented architecture integrating any existing bioinformatic tools,
  • data management, analysis and reporting for various biological data including nucleotide and amino acid sequences, protein modelling, phylogeny, chemical structures, numerical data, and text data,
  • rapid application prototyping,
  • configurable components for data retrieval, manipulation, computational filtering and graphical display,
  • "custom-tailored" component capability leveraging standard technologies including SOAP/Web Services, Perl, Java technologies such as server-side Spring and Hibernate, client-side Eclipse RCP and GWT, and relational DBMSs such as PostgreSQL,
  • integration with third-party applications, databases and existing scientific infrastructure including Microsoft Excel, PubMed, PubChem, and other sequence and protein data repositories,
  • grid systems (GridGain) for process optimisation.