VisuMap

VisuMap is a high-dimensional data visualizer. It provides a number of dimensionality reduction methods, such as principal component analysis, Sammon mapping, curvilinear component analysis, relational perspective map and SMACOF MDS. It also has a few data clustering methods, such as k-means clustering, agglomerative clustering, self-organizing map and metric sampling.

The website offers sample maps and sample datasets for you to work on. There are also white papers and demo videos on the software (which is only available after you register with the website).

To evaluate this software, I used my own dataset. In one of my previous research projects, I gathered three congeneric groups of compounds: penicillins, cephalosporins and fluoroquinolones. I computed fingerprints (1025 dimensions) for these compounds using openbabel and combined them into one dataset. I then loaded the dataset into VisuMap and ran it through each of the dimensionality reduction methods.
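The workflow described above (binary fingerprints in, 2D coordinates out) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the fingerprints are small synthetic stand-ins generated by a hypothetical `make_demo_fingerprints` helper, not the real compound data, and the PCA is a textbook power-iteration variant, not VisuMap's own implementation.

```python
import random


def make_demo_fingerprints(per_group=10, seed=42):
    """Hypothetical stand-in data: 3 groups of 28-bit binary fingerprints."""
    rng = random.Random(seed)
    rows, labels = [], []
    for group in range(3):
        for _ in range(per_group):
            bits = [0] * 28
            for j in range(group * 8, group * 8 + 8):
                bits[j] = 1  # bits shared by every member of the group
            for j in range(24, 28):
                bits[j] = 1 if rng.random() < 0.5 else 0  # per-compound noise
            rows.append(bits)
            labels.append(group)
    return rows, labels


def center(rows):
    """Subtract the column mean from every column."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    return [[r[j] - means[j] for j in range(d)] for r in rows]


def top_component(X, iters=200, seed=0):
    """Leading principal direction of centered data X via power iteration."""
    rng = random.Random(seed)
    d = len(X[0])
    v = [rng.gauss(0.0, 1.0) for _ in range(d)]
    norm = sum(c * c for c in v) ** 0.5
    v = [c / norm for c in v]
    for _ in range(iters):
        scores = [sum(a * b for a, b in zip(x, v)) for x in X]  # X v
        w = [sum(s * x[j] for s, x in zip(scores, X)) for j in range(d)]  # X^T (X v)
        norm = sum(c * c for c in w) ** 0.5
        if norm == 0.0:
            break  # no variance left in the data
        v = [c / norm for c in w]
    return v


def pca_project(rows, n_components=2):
    """Project rows onto the first n_components principal directions."""
    X = center(rows)
    columns = []
    for _ in range(n_components):
        v = top_component(X)
        scores = [sum(a * b for a, b in zip(x, v)) for x in X]
        columns.append(scores)
        # Deflate: remove the part of the data explained by this direction.
        X = [[a - s * b for a, b in zip(x, v)] for x, s in zip(X, scores)]
    return [tuple(col[i] for col in columns) for i in range(len(rows))]


rows, labels = make_demo_fingerprints()
points = pca_project(rows)
for label, (x, y) in list(zip(labels, points))[::10]:
    print(label, round(x, 2), round(y, 2))  # one sample point per group
```

Plotting `points` coloured by `labels` would give a map comparable in spirit to the PCA picture below, although real fingerprints are much noisier than this toy data.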

pca3d.jpg

Results from principal component analysis. Yellow squares are cephalosporins, red circles are penicillins and blue triangles are fluoroquinolones.

sammon2d.jpg

Results from Sammon mapping. Yellow squares are cephalosporins, red circles are penicillins and blue triangles are fluoroquinolones.

cca2d.jpg

Results from curvilinear component analysis. Yellow squares are cephalosporins, red circles are penicillins and blue triangles are fluoroquinolones.

rpm2d.jpg

Results from relational perspective map. Yellow squares are cephalosporins, red circles are penicillins and blue triangles are fluoroquinolones.

mds2d.jpg

Results from SMACOF MDS. Yellow squares are cephalosporins, red circles are penicillins and blue triangles are fluoroquinolones.

All the pictures above (except the one for PCA) are 2D maps produced by the various algorithms. Although the software can also produce 3D maps, they are not easy to visualize because the software does not provide very good controls for rotating the map. I could not get the 3D animation to work in my VMware machine, so I don’t know whether it provides an easy way to view 3D maps. It would be good if the software adopted the approach that molecular structure viewers like Sybyl use for 3D structures (i.e. hold down the right mouse button and move the mouse to rotate).

It can be seen from the pictures that PCA, Sammon mapping and MDS did a very good job of showing that there are three distinct groups, as there is an obvious separation between them. (The colours and shapes of the different groups were added manually to enhance the visual effect. Bear in mind that when you process a dataset with unknown groupings, every point will look the same, so the only way to differentiate groups is if there is an obvious separation band.) For the other algorithms, the separation between the groups is not as good, although it can be seen that members of each group do not mix with those from other groups. The Sammon and MDS algorithms also correctly showed that penicillins and cephalosporins are closer to each other than they are to fluoroquinolones.


5 Responses to “VisuMap”

  1. krishnakumari Says:

    You said you have worked on 1025 dimensions. What about the amount of data you have taken (1 million, 0.5 million, etc.),
    i.e. the data size you have considered?

  2. James Says:

    Thanks for reviewing our software. The 3D animation service in VisuMap
    requires the DirectX library, as documented in the installation guide.
    The navigation of the 3D maps is very similar to that
    of the PCA window, except that it is much faster for large datasets
    (>5K data points). The 3D navigation interface
    is modeled like GoogleEarth, so that you can virtually
    fly within your data using your mouse.

    For most mapping algorithms in VisuMap the dataset size is limited
    to 5000 to 10000 data points. If you have more data points, you
    should use one of the integrated clustering algorithms to
    reduce the dataset to a more manageable size. For instance,
    you can easily reduce a dataset with 1 million data points to
    a few thousand clusters with the self-organizing map within a few hours.

    You can also use the clustering services to color data points
    automatically according to their clusters.

    It should also be pointed out that mapping algorithms like Sammon map and PCA emphasize the global inter-cluster structure, whereas other mapping algorithms (like RPM and CCA) put more emphasis on the details within clusters.

  3. Yap Chun Wei Says:

    Hi Krishnakumari,

    The size of the dataset used is only 171 compounds.

    Hi James,

    Thanks for the clarification. Yes, I am aware that VisuMap requires the DirectX library, but since my machines are all Linux-based, I can only run it using Windows under VMware. I had actually also asked my graduate student to try the software and she mentioned that the 3D navigation control is rather good.

    I will be looking more into the clustering algorithms next. However, as I had mentioned in my previous post, I am not an expert in visualization software. I am still learning how to best utilize such software for research, in particular in QSAR research. Thus I hope readers don’t really treat these posts as reviews of the software but rather as just the personal observations of an amateur using the software. I welcome comments from readers and yourself on how to use such software more effectively, and corrections of any inaccuracies that may inadvertently arise.

  4. James X. Li Says:

    It should be noted that, apart from the mapping algorithms, the distance metric
    you choose to measure the dissimilarity between data points also plays a very important role in this kind of analysis. If you are using fingerprint vectors which, I suppose, are binary feature flags, I would suggest trying other metrics like Jaccard or Dice distance.

    VisuMap allows you to plug in your own distance functions to characterize dissimilarities. For standard metrics, like Jaccard distance, there are free ready-to-use plugin modules.

  5. Richard Says:

    James,

    How can I contact you? Your server/site bounces and won’t redirect email.
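Comment 4 above suggests Jaccard or Dice distance for binary fingerprints. For readers unfamiliar with these metrics, here is a minimal Python sketch using their standard definitions. The function names are made up for illustration and are not part of VisuMap or its plugin modules; the fingerprints are assumed to be equal-length sequences of 0/1 integers.

```python
def jaccard_distance(a, b):
    """1 - |A and B| / |A or B|, where A and B are the sets of 'on' bits."""
    both = sum(1 for x, y in zip(a, b) if x and y)
    either = sum(1 for x, y in zip(a, b) if x or y)
    # Two all-zero fingerprints are treated as identical (an assumption).
    return 0.0 if either == 0 else 1.0 - both / either


def dice_distance(a, b):
    """1 - 2|A and B| / (|A| + |B|), where A and B are the sets of 'on' bits."""
    both = sum(1 for x, y in zip(a, b) if x and y)
    total = sum(a) + sum(b)  # valid because the bits are 0/1 integers
    return 0.0 if total == 0 else 1.0 - 2.0 * both / total


def distance_matrix(fingerprints, metric):
    """Full pairwise distance matrix, the usual input to MDS-style mappers."""
    return [[metric(p, q) for q in fingerprints] for p in fingerprints]


fps = [
    [1, 1, 0, 0],
    [1, 0, 1, 0],
    [1, 1, 0, 0],
]
for row in distance_matrix(fps, jaccard_distance):
    print([round(d, 3) for d in row])
```

Unlike Euclidean distance, both metrics ignore positions where neither fingerprint has a bit set, which is often what you want when most of the bits are zero.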
