@hexylena
Member

New Features

  • bam
  • auto-generated SNP from BAM
  • bigwig
  • bigwig options like XY vs Density, display of variance_band
  • bigwig options like min, max
  • vcf
  • blast XML
  • Perl/Perl module dependencies
  • samtools dependency.

Preview:

Visualize ALL the data

Questions:

  • is it OK to run bgzip/tabix on vcf data, or is there another way I should be accessing that? (Yes)
  • should I expose colouring options? Isn't there a new UI element for colours? (Not yet)
  • should I re-add wig support? I removed it in favour of bigWig only (No)
  • should I support pre-set min/max values for BigWig data (normally it auto-scales)? (Yes)

@hexylena hexylena self-assigned this May 13, 2015
@bgruening
Member

Questions:

  • is it OK to run bgzip/tabix on vcf data, or is there another way I should be accessing that?

Galaxy has an integrated converter for it: https://github.com/galaxyproject/galaxy/blob/dev/config/datatypes_conf.xml.sample#L204 I also hope to make use of this for visualizations. Not sure about this though.
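
For comparison, the manual route asked about in the question might look roughly like the sketch below. It assumes the `bgzip` and `tabix` executables are on `PATH`; the helper names `vcf_index_commands` and `compress_and_index` are made up for illustration and are not part of the tool in this PR.

```python
import subprocess


def vcf_index_commands(vcf_path):
    """Return (output path, bgzip argv, tabix argv) for a plain-text VCF.

    Hypothetical helper: it only builds the command lines and runs nothing.
    """
    gz_path = vcf_path + '.gz'
    bgzip_cmd = ['bgzip', '-c', vcf_path]        # -c: write compressed data to stdout
    tabix_cmd = ['tabix', '-p', 'vcf', gz_path]  # -p vcf: use the VCF preset
    return gz_path, bgzip_cmd, tabix_cmd


def compress_and_index(vcf_path):
    """Run bgzip, then tabix; assumes both executables are on PATH."""
    gz_path, bgzip_cmd, tabix_cmd = vcf_index_commands(vcf_path)
    with open(gz_path, 'wb') as handle:
        subprocess.check_call(bgzip_cmd, stdout=handle)
    subprocess.check_call(tabix_cmd)  # writes gz_path + '.tbi' next to the data
    return gz_path, gz_path + '.tbi'
```

Using Galaxy's built-in converter instead would avoid shipping this logic in the tool, at the cost of having to make sure the index travels with the dataset.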

  • should I expose colouring options? Isn't there a new UI element for colours?

Yes, @guerler is working on this. Imho it is not yet in dev.

  • should I re-add wig support? I removed it in favour of bigWig only

I would go with bigWig support only. Galaxy has converters if someone really cares.

  • should I support pre-set min/max values for BigWig data (normally it auto-scales)?

I guess this is an important option, but then again it depends on your use cases.

if len(json_data.keys()) == 0:
    return

print json_data
Member

debug output?

Member Author

yep, will remove before it's out of WIP

@hexylena
Member Author

@bgruening

Galaxy has an integrated converter for it

okay, then we just need to make sure the indices are attached like for bam files. Added to TODO

I would go with bigWig support only. Galaxy has converters if someone really cares.

Great, sounds good. I want it to be easy, and as long as Galaxy will list wig files that can be auto-converted for use...

should I support pre-set min/max values for BigWig data (normally it auto-scales)?

I guess this is an important option, but then again it depends on your use cases.

Just want a general utility JBrowse tool which exposes a lot of the tougher to deal with JBrowse configuration in a user-friendly way. Trying to find a happy place where it isn't an overwhelming amount of configuration. I'll probably add in the min/max then.... I really want people to use this and tell me what needs to be fixed for their datasets...
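
As a rough illustration of what exposing min/max could mean on the configuration side, here is a sketch that builds one track entry for JBrowse's `trackList.json`. The key names (`min_score`, `max_score`, `autoscale`, `variance_band`) follow my understanding of the JBrowse 1.x Wiggle track options and should be checked against the packaged version; the function name `bigwig_track` and the file path are hypothetical.

```python
import json


def bigwig_track(label, url, min_score=None, max_score=None,
                 style='XYPlot', variance_band=False):
    """Build one trackList.json entry for a BigWig file (illustrative only)."""
    track = {
        'label': label,
        'key': label,
        'storeClass': 'JBrowse/Store/SeqFeature/BigWig',
        'type': 'JBrowse/View/Track/Wiggle/' + style,  # 'XYPlot' or 'Density'
        'urlTemplate': url,
        'variance_band': variance_band,
    }
    if min_score is None and max_score is None:
        # No bounds given: leave the scaling to JBrowse.
        track['autoscale'] = 'local'
    else:
        # Pre-set bounds: only emit the ones the user actually supplied.
        if min_score is not None:
            track['min_score'] = min_score
        if max_score is not None:
            track['max_score'] = max_score
    return track


print(json.dumps(bigwig_track('coverage', 'raw/coverage.bw', 0, 100), indent=2))
```

Keeping both bounds optional means the default behaviour stays auto-scaled, so the extra options do not add to the configuration burden unless someone needs them.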

@hexylena
Member Author

BlastXML to gapped GFF3 will be part of this PR soon.

@peterjc you have any thoughts/comments/opinions on the following graphics?
screenshot from 2015-05-20 18 33 23
screenshot from 2015-05-20 19 03 00

@peterjc
Contributor

peterjc commented May 21, 2015

Pretty graphics :)

See also peterjc/galaxy_blast#61 plus there has been some recent discussion about returning BLAST XML etc from the Tool Shed to the Galaxy core. If that happens then BLAST XML to gapped GFF3 could also go in the core...

@hexylena
Member Author

Thanks @peterjc, thought I'd ping you since you're the blast person...

Definitely, I'd love to see all the blast stuff in core, I'll be happy to PR that PR with a datatype converter.

@hexylena
Member Author

Now with colouring for e-value.
Getting pretty close to a full replacement for, with more features than, trackster ...
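
One way such colouring could work is to map each hit's e-value onto a colour ramp on a log scale. The sketch below is only a guess at the idea, not the scheme used in this PR: the function name `evalue_to_hex`, the grey-to-red ramp, and the `1e-100` floor are all assumptions.

```python
import math


def evalue_to_hex(evalue, floor=1e-100):
    """Map a BLAST e-value to a grey-to-red hex colour (hypothetical scheme)."""
    # Clamp so that an e-value of 0 does not break the logarithm.
    clamped = min(max(evalue, floor), 1.0)
    # 0.0 for e-value >= 1 (weak hit), 1.0 for e-value <= floor (strong hit).
    strength = math.log10(clamped) / math.log10(floor)
    red = 200 + int(round(55 * strength))
    other = int(round(200 * (1 - strength)))  # green and blue fade as hits improve
    return '#{:02x}{:02x}{:02x}'.format(red, other, other)
```

Working in log space matters here because e-values span many orders of magnitude; a linear ramp would render almost every significant hit in the same colour.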

screenshot from 2015-05-21 12 43 07

@hexylena hexylena changed the title [WIP] JBrowse Update JBrowse Update May 22, 2015
@hexylena
Member Author

I don't expect anyone to review this before next week, but if anyone feels like it, this is out of WIP and ready to be merged. :)

(Additionally I recognise that pet-projects have a different relationship with the IUC repo than normal packaging efforts.)

@hexylena hexylena removed their assignment May 22, 2015
gff3_unrebased.close()

gff3_rebased = tempfile.NamedTemporaryFile(delete=False)
cmd = ['python', os.path.join(INSTALLED_TO, 'gff3_rebase.py')]
Member

Both files are located in the same directory, so why not import the main function and use it without tempfile and subprocess?

Member Author

the thinking with tempfile was that it will be better behaved than reading/storing the entire data structure in memory. However, the parser isn't written in a very "streaming" fashion right now, so huge amounts of data will still be stored in memory.

You're absolutely right though, I'll switch over to importing :)
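
To make the trade-off concrete, here is a minimal sketch of what an importable, streaming-friendly entry point could look like. `shift_coordinates` is a hypothetical stand-in and does not reproduce the actual logic of `gff3_rebase.py`; it only shows the calling pattern without a temporary file or a child process.

```python
import io


def shift_coordinates(lines, offset):
    """Hypothetical stand-in for an importable main function.

    Yields GFF3 lines with start/end (columns 4 and 5) shifted by offset,
    one line at a time, so the caller never holds the whole file in memory.
    """
    for line in lines:
        if line.startswith('#') or not line.strip():
            yield line  # pass comments and blank lines through untouched
            continue
        cols = line.rstrip('\n').split('\t')
        cols[3] = str(int(cols[3]) + offset)
        cols[4] = str(int(cols[4]) + offset)
        yield '\t'.join(cols) + '\n'


# Direct call: no temporary file, no child Python process.
source = io.StringIO('##gff-version 3\nchr1\tblast\tmatch\t10\t50\t.\t+\t.\tID=hit1\n')
rebased = list(shift_coordinates(source, 100))
```

A generator-based function like this can be fed any iterable of lines (an open file, another generator), which also addresses the memory concern that motivated the temporary file.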

Member

Do you want to switch over?

Member Author

Shoot, hadn't tested. Will test now.

Member

What happened to this one? I'm fine with it being called as a subprocess, just wanted to be sure.

Member Author

Let's leave it as is for this round, since I know it is already working, and the testing for jbrowse isn't easy yet.

Member Author

I'll be PRing another jbrowse update soon to do the following:

  • switch over to importing (maybe?)
  • add test cases
  • remove the sample_data folder from use in the package
  • add more configurability for track plotting
  • figure out whatever bug makes viewing bam/bigwig data nearly impossible inside of galaxy. outside of galaxy everything works fine.

@bgruening
Member

@erasche;

  • you need to add BCBio & biopython, samtools (aka tabix, bgzip), perl? as dependencies
  • I would put blastxml_to_gapped_gff3 into a separate repository

See my other small comments inline.

@hexylena hexylena changed the title JBrowse Update [WIP] JBrowse Update May 24, 2015
@hexylena
Member Author

@bgruening

  • bcbio/biopython should already be there.
  • samtools and a bunch of perl libs will have to be added, thanks.
  • I would like to separate out blastxml_to_gapped_gff3 and gff3_rebase since those are useful standalone tools, but I didn't know where to put them. xml2gff3 makes sense as a datatype converter if blast datatypes move into galaxy, but it has user-editable options and I don't remember datatype conversion supporting that. gff3_rebase will be useful for our annotation pipelines, but the IUC repo is mostly tool suites rather than individual tools. Should I open a PR with those in separate folders?
  • moved back to WIP until I can fix your suggestions.

@bgruening
Member

@erasche I don't see the IUC repo only for tool suites. Feel free to add those converters as standalone tools.

@hexylena
Member Author

@bgruening okay! Will do!

@hexylena
Member Author

@bgruening do you have any examples of having a tool repo be a dependency? e.g.

<?xml version="1.0"?>
<tool_dependency>
  <package name="jbrowse" version="1.11.6">
    <repository name="package_jbrowse_1_11_6" owner="iuc"/>
  </package>
  <tool name="blastxml_to_gff3" version="1.0">
    <repository name="blastxml_to_gff3" owner="iuc"/>
  </tool>

? How does one write the XML for that, I'm not seeing any examples.

Or do I need to:

  • publish the python code somewhere
  • make that into a package_
  • separate out the XML files into tools?

That seems awfully complex.

Or should I just copy+paste the scripts into this directory and keep them in a separate repo as well?

# may be longer than the parent feature, so we use the supplied
# subject/hit length to calculate the real ending of the target
# protein.
#print hsp.align_length, hit.length
Member

debug?

Member Author

yep, that was from testing/debugging the math. Removed.

@bgruening
Member

@erasche please also see my old comments.
What about symlinking your files as much as you can to avoid duplication?
Overall, impressive work!

@hexylena
Member Author

@bgruening thanks for the amazing feedback, I really appreciate it. I'll get fixes implemented shortly.

@hexylena
Member Author

@bgruening okay, should be closer to ready. Used symlinks everywhere possible.

@bgruening
Member

@erasche you can also symlink on git level to avoid code duplication. Thanks for the fixes.

@hexylena
Member Author

Okay, travis is failing now due to the bundle. I'll have to correct that first thing tomorrow along with git symlinks.

@hexylena
Member Author

@bgruening should be good to go, if travis tests pass.

bgruening added a commit that referenced this pull request Jun 11, 2015
@bgruening bgruening merged commit fd63087 into master Jun 11, 2015
@bgruening bgruening deleted the jbrowse_02 branch June 11, 2015 17:55
7 participants