A non-linear world.

The world is not linear.

Most of the time you need multiple variables.

Sometimes you would need a square, a cube, or even a log.

Even with lots of trials, it might still be difficult to fit in.

And if you try too hard, you might end up over-fitting.

After all, it is quite a curvaceous world.


The Mexican Mestizo population genome

I accidentally stumbled upon this paper about a Mexican genome study.

Mexico has been exploring its population's genome since 2009! The paper is "Analysis of genomic diversity in Mexican Mestizo populations to develop genomic medicine in Mexico", published in PNAS in 2009.

Although the sample size was not very big and the genotyping was done on a platform with only 100,000 SNPs, they started this nine years ago already.

Anonymous blood samples from 300 non-related and self-defined Mestizos and 30 Amerindian Zapotecos were collected in 7 states in Mexico: Guanajuato, Guerrero, Sonora, Veracruz, Yucatan, Zacatecas, and Oaxaca (ZAP). Genotyping was performed according to the Affymetrix 100K SNP array protocol…

Maybe several other publications have come out since this one?

The HFS+ file system is NOT case-sensitive, but it is case-preserving!

  • On a macOS system, cloned a git repository with two directories: 1) Assignment and 2) assignment. [Notice the A vs a].
  • Couldn’t find the folder “Assignment” locally
  • Remote git seems to have this folder.
  • Tried to find out whether the local git repository was up to date or not.
  • Checked the commit ID, checked the log and confirmed that the local repository is up-to-date!
  • OMG … what happened!!!!!!!!!!!!!!!!!!!!!!!!!!!
  • Tried to check whether the correct git command was used. Read about git fetch, git pull, etc, etc.
  • Feeling frustrated.
  • Got no work done.
  • Suspected that maybe the problem was case-insensitivity on macOS. Therefore, tried Googling “macos filename case sensitive”.
  • Finally, found the answer to the problem via Google: https://apple.stackexchange.com/a/22304/179773


See the result below.

[Screenshot: case-insensitive but case-preserved]
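
To spot this kind of clash before it costs an afternoon, a small script can look for tracked paths that differ only by letter case. This is a minimal sketch: the function name find_case_collisions and the sample paths are made up for illustration, and Python's casefold() is only an approximation of how HFS+ compares names.

```python
from collections import defaultdict


def find_case_collisions(paths):
    """Return groups of paths (files or directories) that differ only by case."""
    groups = defaultdict(set)
    for path in paths:
        parts = path.strip("/").split("/")
        # Check every directory prefix as well, so that "Assignment/x" and
        # "assignment/y" are reported as a clash at the directory level.
        for i in range(1, len(parts) + 1):
            prefix = "/".join(parts[:i])
            # casefold() approximates case-insensitive comparison (assumption).
            groups[prefix.casefold()].add(prefix)
    return sorted(sorted(names) for names in groups.values() if len(names) > 1)


if __name__ == "__main__":
    # Hypothetical list of tracked files; in practice this could come from
    # the output of `git ls-files`, one path per line.
    tracked = ["Assignment/week1.txt", "assignment/notes.txt", "README.md"]
    for names in find_case_collisions(tracked):
        print("case collision:", ", ".join(names))
```

Running it on the sample list reports Assignment and assignment as one colliding pair, which is exactly the situation that made one of the two folders disappear locally.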

AWS S3 Bucket Policy Setup for Specific Bucket Access

Setting: You want to allow a user to upload data to an S3 bucket using the AWS CLI, but do not want this specific user to see what other buckets exist in your AWS account.

Solution: This can be done by setting up the policy below.

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "some_number",
      "Effect": "Allow",
      "Action": [ ... ],
      "Resource": [ ... ]
    }
  ]
}

If you also want the user to be able to list all other buckets, add the following additional statement to the Statement section:

{
  "Effect": "Allow",
  "Action": "s3:ListAllMyBuckets",
  "Resource": [ ... ]
}

Note: Replace “bucket-name” with the name of your bucket. Also, note that the Sid should be your own Sid. I used the “policy generator” to help generate the policy, modifying the settings from the reference below.
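
For reference, here is a minimal sketch of how such a policy document could be generated with Python. The action list (s3:ListBucket, s3:GetObject, s3:PutObject), the resource ARNs, the function name build_bucket_policy and the Sid value "YourSid" are my assumptions for illustration, not necessarily what the policy generator produced, so adjust them to your own needs.

```python
import json


def build_bucket_policy(bucket, sid, allow_list_all=False):
    """Build an IAM policy dict that limits access to a single S3 bucket."""
    bucket_arn = f"arn:aws:s3:::{bucket}"
    statements = [
        {
            "Sid": sid,
            "Effect": "Allow",
            # Assumed action set: list the bucket, download and upload objects.
            "Action": ["s3:ListBucket", "s3:GetObject", "s3:PutObject"],
            # The bucket itself (for listing) and every object inside it.
            "Resource": [bucket_arn, f"{bucket_arn}/*"],
        }
    ]
    if allow_list_all:
        # Optional extra statement: let the user see the names of all buckets.
        statements.append(
            {
                "Effect": "Allow",
                "Action": "s3:ListAllMyBuckets",
                "Resource": ["arn:aws:s3:::*"],
            }
        )
    return {"Version": "2012-10-17", "Statement": statements}


if __name__ == "__main__":
    # "bucket-name" and "YourSid" are placeholders.
    print(json.dumps(build_bucket_policy("bucket-name", "YourSid"), indent=2))
```

Passing allow_list_all=True appends the second statement, which corresponds to the optional bucket-listing permission described above.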

Listing the content of bucket-name

aws s3 ls s3://bucket-name --region ap-northeast-2 --profile s3-bucket-username

Uploading the directory myfile_folder to the bucket (copying a directory needs the --recursive flag)

aws s3 cp myfile_folder s3://bucket-name --recursive --region ap-northeast-2 --profile s3-bucket-username

You can also try the sync command

aws s3 sync myfile_folder s3://bucket-name --region ap-northeast-2 --profile s3-bucket-username
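
If you run these uploads often, a tiny helper can assemble the command and add --dryrun so that the AWS CLI only prints what it would transfer. This is just a sketch: the function name s3_upload_command is made up, and the bucket, region and profile values are the same placeholders as above.

```python
import shlex


def s3_upload_command(action, source, bucket, region, profile, dry_run=True):
    """Build the argument list for `aws s3 cp` or `aws s3 sync`."""
    if action not in {"cp", "sync"}:
        raise ValueError("action must be 'cp' or 'sync'")
    cmd = ["aws", "s3", action, source, f"s3://{bucket}"]
    if action == "cp":
        # cp needs --recursive to copy a whole directory; sync does not.
        cmd.append("--recursive")
    if dry_run:
        # --dryrun shows the planned operations without performing them.
        cmd.append("--dryrun")
    cmd += ["--region", region, "--profile", profile]
    return cmd


if __name__ == "__main__":
    cmd = s3_upload_command(
        "sync", "myfile_folder", "bucket-name", "ap-northeast-2", "s3-bucket-username"
    )
    # Print the command for inspection; pass `cmd` to subprocess.run to execute it.
    print(shlex.join(cmd))
```

Once the dry run looks right, call the helper again with dry_run=False to get the real command.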

Ref: http://mikeferrier.com/2011/10/27/granting-access-to-a-single-s3-bucket-using-amazon-iam/

Reviewing NGS Variant Call with IGV

Although I don’t really support a brute-force approach of manually reviewing every variant, if you only have a few top signals that you would like to confirm before wet-lab validation, IGV might still prove helpful.

This review by Robinson et al., from a group at UCSD, sheds some light on and gives detail about how you can do the manual review in IGV: http://cancerres.aacrjournals.org/content/77/21/e31

I also found the IGV manual page describing all the options in the Preferences menu quite useful: https://software.broadinstitute.org/software/igv/Preferences
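
For a handful of top hits, one convenience is to let IGV jump to each position and save a snapshot through its batch commands, then review the images. The sketch below only writes such a batch script; the function name igv_batch_script, the BAM file name, the genome ID and the variant positions are all hypothetical placeholders.

```python
def igv_batch_script(bam_path, genome, variants, snapshot_dir, flank=50):
    """Return an IGV batch script that snapshots a window around each variant."""
    lines = [
        "new",
        f"genome {genome}",
        f"load {bam_path}",
        f"snapshotDirectory {snapshot_dir}",
    ]
    for chrom, pos in variants:
        # Show `flank` bases on each side of the variant position.
        start = max(1, pos - flank)
        end = pos + flank
        lines.append(f"goto {chrom}:{start}-{end}")
        lines.append(f"snapshot {chrom}_{pos}.png")
    lines.append("exit")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical top signals to review: (chromosome, 1-based position).
    top_hits = [("chr17", 7577121), ("chr12", 25398284)]
    script = igv_batch_script("tumor.bam", "hg19", top_hits, "igv_snapshots")
    with open("review.bat", "w") as handle:
        handle.write(script)
    print(script)
```

The resulting file can be run from within IGV as a batch script. It does not replace looking at the reads yourself; it only saves the clicking.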

Improving the quality of cancer tissues for research

Through careful characterization of specimens, a new study has reached some conclusions on how we can improve the quality of cancer specimens for research.

Read the summary in the NCI blog post: https://www.cancer.gov/news-events/cancer-currents-blog/2018/improving-cancer-research-biopsies?cid=eb_govdel

The full article is published in the Journal of Oncology Practice: https://www.ncbi.nlm.nih.gov/pubmed/?term=30285529


Using a public exome database as your controls in WES association studies

Check out the newly released software TRAPD, which stands for Test Rare vAriants with Public Data: https://github.com/mhguo1/TRAPD

Read the details in the article published in AJHG this month: https://www.cell.com/ajhg/fulltext/S0002-9297(18)30284-2
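
To give a feel for the underlying idea, one simple way to compare carrier counts for a gene between your cases and a public control set is a one-sided Fisher's exact test. The sketch below illustrates that general idea only. It is not TRAPD's code, the function name is made up, and the counts in the example are invented.

```python
from math import comb


def fisher_exact_one_sided(case_carriers, case_total, control_carriers, control_total):
    """P-value for seeing at least `case_carriers` carriers among the cases,
    given the overall number of carriers (hypergeometric upper tail)."""
    carriers = case_carriers + control_carriers
    total = case_total + control_total
    denominator = comb(total, carriers)
    upper = min(case_total, carriers)
    numerator = sum(
        # Ways to place k carriers among cases and the rest among controls.
        comb(case_total, k) * comb(control_total, carriers - k)
        for k in range(case_carriers, upper + 1)
    )
    return numerator / denominator


if __name__ == "__main__":
    # Invented counts: rare-variant carriers in one gene among 200 cases
    # versus carriers among 60000 public-database controls.
    p = fisher_exact_one_sided(8, 200, 150, 60000)
    print(f"one-sided p-value: {p:.3g}")
```

Real analyses need much more care than this (matching variant calling and filtering between cases and public data, for example), which is what the tool and the paper are about.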