Thursday 29 May 2014

Week 5&6

In these weeks I looked at the pathways and did not find anything special to mention. Too few genes met the criterion ([logFC] < -0.585 OR [logFC] > 0.585) AND [P.Value] < 0.05, so I relaxed it to ([logFC] < -0.263 OR [logFC] > 0.263) AND [P.Value] < 0.05.
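To make the effect of relaxing the cutoff concrete, here is a minimal Python sketch of the criterion as a filter. The gene names and values are made up for illustration, the P-value cutoff of 0.05 is taken from the Week 4 criterion, and I assume logFC is a log2 fold change (the limma convention), so 0.585 and 0.263 correspond to roughly 1.5-fold and 1.2-fold changes.

```python
def meets_criterion(log_fc, p_value, fc_cutoff, p_cutoff=0.05):
    """True if (logFC < -cutoff OR logFC > cutoff) AND P.Value < p_cutoff."""
    return (log_fc < -fc_cutoff or log_fc > fc_cutoff) and p_value < p_cutoff


def select(genes, fc_cutoff, p_cutoff=0.05):
    """Return the names of all genes that meet the criterion."""
    return [name for name, log_fc, p in genes
            if meets_criterion(log_fc, p, fc_cutoff, p_cutoff)]


# Hypothetical genes: (name, logFC, P.Value). Not taken from the dataset.
genes = [
    ("geneA", 0.70, 0.010),
    ("geneB", -0.30, 0.020),
    ("geneC", 0.40, 0.200),
    ("geneD", -0.90, 0.001),
    ("geneE", 0.10, 0.030),
]

strict = select(genes, 0.585)   # old cutoff
relaxed = select(genes, 0.263)  # new cutoff

print(strict)   # ['geneA', 'geneD']
print(relaxed)  # ['geneA', 'geneB', 'geneD']

# The cutoffs expressed as plain fold changes (assuming log2 values).
print(round(2 ** 0.585, 2), round(2 ** 0.263, 2))  # 1.5 1.2
```

With the relaxed cutoff, geneB (a modest but significant change) is now counted as positive, which is the same effect I saw in the pathway statistics.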
 
The following tables show the ranking with the new criterion.
 
 

Z-score Experimental VS baseline
 
Z-score Acute VS treated
The number of positive genes (genes that meet the criterion) rose, and the ranking also differs from the ranking under the previous criterion (those tables are shown in the previous blog post).
 
More inflammation pathways now show up, such as the NF-kB, IL and toll-like receptor pathways.
 
Quercetin and Nf-kB/AP-1 induced cell apoptosis. This is the highest ranked pathway in the acute VS treated group.
 
It is expected that after treatment the genes in inflammation pathways will be downregulated. So I wanted to look at genes with a high (or low) FC and a significant p-value that were not found in PathVisio.
 
 
--> TREATED VS ACUTE
 
 
Table: acute VS treated group, genes with the lowest logFC
 
Table: acute VS treated group, genes with the highest logFC
 
Table: acute VS treated group, genes with the lowest p-value
 
 
 
--> EXPERIMENTAL VS BASELINE
 
 
Table: experimental VS baseline group, genes with the lowest logFC
 
Table: experimental VS baseline group, genes with the highest logFC
 
Table: experimental VS baseline group, genes with the most significant p-value

Monday 19 May 2014

Week 4 (14-18 May 2014)

At the end of the week I used one of the output tables produced by R for the analysis in PathVisio. It lists the Ensembl ID of every gene that was found, together with its logFC, fold change, average expression, t value (the outcome of the statistical t-test), P-value, adjusted P-value and B value. This file is the input for the expression data import.
In PathVisio I selected the human gene database HS_Derby_20130701.bridge. I also imported the WikiPathways Homo sapiens Curation-Tutorial (a GPML file).
In PathVisio I could then create a visualization. I wanted to show the up- and downregulated gene expression and the p-value.
Finally I ran a statistical test on all the pathways. I wanted the pathways to be ranked by the criterion ([logFC]>0.585 OR [logFC]<-0.585) AND [P-value]<0.05.

For the comparison of the treated group with the acute malaria group, the first ten pathways that meet the criterion are shown.

Pathway | positive (r) | measured (n) | total | % | Z Score | p-value (permuted)
RB in Cancer | 9 | 92 | 104 | 9.78% | 6.02 | 0.001
Neurotransmitter uptake and Metabolism In Glial Cells | 1 | 2 | 13 | 50.00% | 5.28 | 0
Transport of Glycerol from Adipocytes to the Liver by Aquaporins | 1 | 2 | 7 | 50.00% | 5.28 | 0.003
Activation of Chaperone Genes by ATF6-alpha | 2 | 8 | 16 | 25.00% | 5.1 | 0.005
Signal amplification | 2 | 11 | 56 | 18.18% | 4.23 | 0.003
Thrombin signalling through proteinase activated receptors (PARs) | 2 | 13 | 53 | 15.38% | 3.82 | 0.005
Activation of Matrix Metalloproteinases | 2 | 15 | 66 | 13.33% | 3.48 | 0.009
Adipogenesis | 7 | 122 | 132 | 5.74% | 3.47 | 0.008
FAS pathway and Stress induction of HSP regulation | 3 | 35 | 43 | 8.57% | 3.15 | 0.022
miR-targeted genes in leukocytes - TarBase | 6 | 108 | 128 | 5.56% | 3.11 | 0.006

For the comparison of the experimentally infected group with the baseline group, the first ten pathways that meet the criterion are shown.

Pathway | positive (r) | measured (n) | total | % | Z Score | p-value (permuted)
Type II interferon signaling (IFNG) | 5 | 35 | 38 | 14.29% | 10.4 | 0
RIG-I/MDA5 mediated induction of IFN-alpha/beta pathways | 4 | 48 | 181 | 8.33% | 6.88 | 0
Heme Biosynthesis | 1 | 8 | 28 | 12.50% | 4.31 | 0.013
NOD pathway | 2 | 30 | 43 | 6.67% | 4.26 | 0.004
Serotonin Transporter Activity | 1 | 9 | 15 | 11.11% | 4.04 | 0.027
Interferon alpha/beta signaling | 2 | 34 | 96 | 5.88% | 3.95 | 0.017
Regulation of toll-like receptor signaling pathway | 4 | 120 | 152 | 3.33% | 3.85 | 0.004
Apoptosis | 3 | 80 | 85 | 3.75% | 3.62 | 0.009
Quercetin and Nf-kB/ AP-1 induced cell apoptosis | 1 | 11 | 25 | 9.09% | 3.61 | 0.03
TAK1 activates NFkB by phosphorylation and activation of IKKs complex | 1 | 11 | 30 | 9.09% | 3.61 | 0.035

***
positive (r) -- the number of genes on the pathway that fulfill the criterion
measured (n) -- the number of genes on the pathway that have been measured in the data set
total -- the total number of genes on the pathway
% -- the percentage of measured genes that fulfill the criterion
z-score -- the z-score as computed by a Fisher exact test on overrepresentation
p-value (permuted) -- the chance of getting a z-score at least this high when the data are randomly permuted
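As a rough illustration of how these columns relate, here is a small Python sketch. The percentage is simply r / n. For the z-score I use the standard hypergeometric-based formula that is commonly used for pathway over-representation; that PathVisio uses exactly this formula is my assumption, and the dataset-wide totals R (all genes meeting the criterion) and N (all genes measured) are hypothetical, because they are not listed in the tables above.

```python
import math


def percentage(r, n):
    """Percentage of measured pathway genes that meet the criterion."""
    return 100.0 * r / n


def pathway_z_score(r, n, R, N):
    """Over-representation z-score (assumed formula, see the text above).

    r: genes on the pathway that meet the criterion
    n: genes on the pathway that were measured
    R: genes in the whole data set that meet the criterion (hypothetical here)
    N: genes measured in the whole data set (hypothetical here)
    """
    expected = n * R / N
    variance = n * (R / N) * (1 - R / N) * (1 - (n - 1) / (N - 1))
    return (r - expected) / math.sqrt(variance)


# The "RB in Cancer" row: r = 9 and n = 92 give the reported percentage.
print(round(percentage(9, 92), 2))  # 9.78

# With made-up totals R = 500 and N = 20000 the pathway is over-represented,
# so the z-score is positive. The value will not match the table, because the
# real R and N are unknown here.
z = pathway_z_score(9, 92, R=500, N=20000)
print(z > 0)  # True
```

A z-score of 0 means the pathway contains exactly as many positive genes as expected by chance, and higher values mean stronger over-representation.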


Next week I plan to take a closer look at these pathways and to compare the two groups (differences and similarities), to try to link the results to biological reasons.
I also plan to look at genes with a high FC and a significant p-value that are not found by PathVisio.

Week 4 (12-13 May 2014)

In the first part of this week Lars helped me write the script for the statistical analysis of the data, and we checked whether the figures were OK to use for further statistical processing. This was done with the R project for statistical computing. The script used limma, bioDist, gplots and some other functions to make the output plots for the statistical modelling.
 
In the first experiment I wanted to compare treated malaria patients with untreated patients with acute malaria, to see the effect of the treatment. The groups were unpaired.
 
P-value frequency histogram. This is not what we wanted: we want many genes with a significant change (a peak at low p-values) and fewer genes across the rest of the p-value range.
 
Adapted fold change diagram
 
 
 
In the second experiment I wanted to test how gene expression in PBMCs changes in people who were experimentally infected with malaria (presymptomatic effects) compared with the baseline group. The groups were paired.
 
P-value frequency histogram. This one looks much more like what we want in a good experiment (compared with experiment 1).
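To show what a "good" P-value histogram looks like, here is a minimal Python sketch with simulated data (not my dataset). Genes without any real change give P-values spread evenly between 0 and 1, while genes with a real change pile up near 0. So a peak in the first bin on top of a flat background is what we hope to see. The number of genes, the bin count and the way the "changed" P-values are simulated are all assumptions for illustration.

```python
import random


def p_value_histogram(p_values, n_bins=20):
    """Count how many P-values fall into each of n_bins equal-width bins."""
    counts = [0] * n_bins
    for p in p_values:
        index = min(int(p * n_bins), n_bins - 1)  # p == 1.0 goes in the last bin
        counts[index] += 1
    return counts


random.seed(0)

# Simulated data: 9000 unchanged genes (uniform P-values) and
# 1000 changed genes (P-values skewed toward 0).
null_p = [random.random() for _ in range(9000)]
signal_p = [random.random() ** 4 for _ in range(1000)]
all_p = null_p + signal_p

counts = p_value_histogram(all_p)
expected_per_bin = len(all_p) / len(counts)  # height of a completely flat histogram

# The first bin (P < 0.05) is clearly higher than the flat level.
print(counts[0] > expected_per_bin)  # True
```

If the first bin is not higher than the rest, as in experiment 1, there are few genes with a detectable change between the groups.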
 
Adapted fold change diagram
 
 

Monday 12 May 2014

Week 3 (5 -11 May 2014)


At the start of this week I ran the array analysis for the two groups separately. This already gave a much better quality report, which could be used for further analysis.

The first dataset (baseline VS experimental group) was not yet completely good. Some outliers should still be removed from the dataset.
 
In this NUSE plot the data of the baseline and the experimental group are shown. Many arrays are still far from the median.
 
Despite what the NUSE plot shows, there is a clear separation between the groups in PCA1/2 (first PCA plot). A separation can also be seen in PCA2/3 (second PCA plot). This is all after normalization. There are still many outliers.
 
In this cluster dendrogram the different groups cluster together, after normalization. There are still some outliers that we might consider excluding.
 
In the second dataset (acute malaria VS treated malaria) only one subject, treated 9, was an outlier. So I also had to rerun this dataset without this subject. The only disadvantage is that there is no baseline for this group.
 
 
The hybridization control intensities and cells were OK.
 
In the NUSE plot many outliers can still be seen.
 
No clear separation can be seen in the PCA plot; the outlier might have an effect on this.

At the end of this week I met with Lars and his students and we discussed our datasets, reports and the statistical analysis. Lars would help everyone write a script for the statistical analyses. He wrote it over the weekend, and at the start of next week we will discuss the scripts.

This week I also attended the first science café, where a lecture about statistical analysis was given.

Week 2 (28 April-4 May 2014)


At the beginning of this week I started writing a good summary of the publication behind my chosen dataset, for which I had run the quality report with array analysis.

That week I also showed my quality report to Egon Willighagen and Bart Smeets, and we discussed the findings of the array analysis. After the discussion we concluded that it would be better to remove one of the baseline subjects, because it was an outlier in all the tables and graphs. This can also affect the results for the other subjects. So I ran the analysis again without subject B11. The quality report was better, but still not what we wanted.

The green dot at the bottom is Baseline 11
The green one at the top is Baseline 11
 
These MA plots show that the array expression for Baseline 11 clearly differs from the other baselines. Baseline 10, 12 and 13 are taken as examples, but the rest were comparable.
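For readers who do not know MA plots: each dot is one probe, with M (the log2 ratio between two arrays) on the vertical axis and A (the average log2 intensity) on the horizontal axis. Below is a minimal Python sketch of how M and A are calculated. The intensities are made up, and comparing one array against a reference array is my assumption about how the plots in the quality report were built.

```python
import math


def ma_values(intensities, reference):
    """Return a list of (M, A) pairs, one per probe.

    M = log2(intensity) - log2(reference)        (difference between arrays)
    A = (log2(intensity) + log2(reference)) / 2  (average signal)
    """
    pairs = []
    for x, ref in zip(intensities, reference):
        log_x, log_ref = math.log2(x), math.log2(ref)
        pairs.append((log_x - log_ref, (log_x + log_ref) / 2))
    return pairs


# Hypothetical raw intensities for three probes.
array = [8.0, 16.0, 64.0]
reference = [4.0, 16.0, 128.0]

print(ma_values(array, reference))
# [(1.0, 2.5), (0.0, 4.0), (-1.0, 6.5)]
```

For a good array most M values lie close to 0 over the whole range of A. An array like Baseline 11 shows a cloud that is clearly shifted or bent away from the M = 0 line.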
 
 
Later that week Lars Eijssen looked at the report and discussed it further with Egon Willighagen, Bart Smeets and some other members of the department of bioinformatics at Maastricht University. Lars advised me to split my dataset into two groups, because the subjects of this study came from different countries (the US and Africa) and also belonged to two different projects.
In this case the first dataset would contain a group from the US that first served as the baseline group and then was infected with malaria and became the experimental group, in which the presymptomatic effects of malaria were compared against the baseline.

The second group would consist only of subjects from Africa, in whom acute malaria was compared against treated malaria.

At the end of this week Lars Eijssen gave a statistics lecture that all interns (and others) from the department of bioinformatics could join.

Monday 5 May 2014

Week 1 (21-27 April 2014)

In the first week (21-27 April) I started my internship at the department of bioinformatics at Maastricht University. The project is from Egon Willighagen and is about searching for a new way to cure malaria using systems biology. I will try to use a systems biology approach to analyse the data and to find a target that plays a role in the development of malaria.

In ArrayExpress I searched for datasets in which people with malaria were compared with healthy persons. I made a table of all the datasets I found. Finally I chose a known dataset, which I will use from here on, in which the changes in gene expression were tested in people with and without malaria. In the subjects with malaria, the differences between presymptomatic and symptomatic changes in the disease were also tested. After that I will read some articles by people who have researched the effects of malaria and a cure for malaria, and try to find links between them, and if possible even find a new cure to fight malaria.

In this first week I have read some articles which I could use for my project.
 
 
In one article the authors compared four different groups (a baseline group, a group that was infected with malaria, a group with acute malaria, and a group that was treated for malaria). Because I thought this article might be useful for my project, I ran the dataset in arrayanalysis.org at the end of this week and planned to look at the results next week.

Publication: Christian F. Ockenhouse, Wan-Chung Hu, Kent E. Kester, James F. Cummings, Ann Stewart, D. Gray Heppner, Anne E. Jedlicka, Alan L. Scott, Nathan D. Wolfe, Maryanne Vahey and Donald S. Burke -- Common and Divergent Immune Response Signaling Pathways Discovered in Peripheral Blood Mononuclear Cell Gene Expression Patterns in Presymptomatic and Clinically Apparent Malaria -- Infect. Immun. 2006, 74(10): 5561. DOI: 10.1128/IAI.00408-06.