-
Notifications
You must be signed in to change notification settings - Fork 5
compatibility for python2 and python3, including C API, and revisions as to how package built #29
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
Speed issue
…into speed_issue merged
Speed issue
Comments: RE: C API, FORTRAN, and wext/setup.py I wrote this to be both 2.7 and 3.x compatible. The C API changed a good deal in Python3.x, but it's quite slick. Hopefully these changes make sense. I didn't touch the Fortran code, but I did revise wext/setup.py to compile both the C and Fortran code into separate modules. It appears to work on Travis, as well as the Linux server. I also recommend users should just install this, so the modules are individually "importable". RE: README The Travis-CI config will have to be activated on your end. Otherwise, the badge will fail. I wish the README contained a bit more motivation/details about WExT, and why this should be used RE: py23 compatibility changes --- Any '.iteritems()' became '.items()'. --- We were previously using implicit relative imports. Python does not like this----these need to be explicit relative imports, e.g. https://github.com/evanbiederstedt/wext/blob/master/wext/exclusivity_tests.py#L5-L6
--- This needs to explicitly be a tuple() for Python3.x https://github.com/evanbiederstedt/wext/blob/master/compute_mutation_probabilities.py#L49 --- The most interesting change is adding 'xranges()' from future, which is now a required dependency. https://github.com/evanbiederstedt/wext/blob/master/compute_mutation_probabilities.py#L12 Normally, I would simply convert this to https://github.com/raphael-group/wext/blob/master/compute_mutation_probabilities.py#L109 e.g. It appears the way around this is to use xrange() for both versions 2 and 3, which is what I've done. |
thanks!
seems that all N/ divisions should be replaced with the int division operator // in examples/generate_data.py
possible solution?..
-running experiments/eccb2016 in find_exclusive_sets.py
fixed in def get_permuted_files(permuted_matrix_directories, num_permutations):
etc. |
Hi @zmiimz Thanks for the comments!
This is true that we should update the README if this pull request is accepted. Currently, it looks like I have
Could you show me which commands you used to get this error? I just tried with
For all of the issues you discuss, could you please provide the version of python you used, and the command you used? That would be helpful moving forward---it's a bit difficult to follow what you've done so far. Thanks, Evan |
started examples and eccb2016 experiment calculation |
Hi @zmiimz Please provide as many details as possible so I can proceed to investigate the errors you report, i.e. the python version, the version of gcc, how you installed the package, and the commands you are running. I'm running this on Mac OS 10.13.6 Versions of python and gcc
I've attached a text file of the commands I use for installation and running Thanks, Evan |
Dear Evan, |
Hi @zmiimz Thanks for the response. Please keep me updated. I'm going to try out the Thanks, Evan |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for all of these changes. I tested the simple example, and the code successfully runs and returns equivalent results on Python 2 and 3. I made several comments but all for minor issues.
- The package
future
is needed for both Python 2 and 3. - Most Python 2/3 functions work fine with iterators whether they are lists (like
range(n)
in Python 2) or generators (likerange(n)
in Python 3) --len
is a notable exception for some reason. I flagged some of these because they make it a little harder to read the code and slow things down the code at times but you can either leave them as-is or change them as needed. print(a, b, c)
prints a tuple (a, b, c) in Python 3. Maybe replace withprint(a + b + c)
so that Python 2 and 3 return same output.
this_dir = os.path.dirname(os.path.realpath(__file__)) | ||
sys.path.append(this_dir) | ||
from wext import * | ||
from past.builtins import xrange |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think we need xrange
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think my sole motivation here was consistency :)
Above I detail the memory/performance issue associated with range()
vs xrange()
between Python2.x and 3.x. So, I may have simply started using this everything to avoid any potential issues. (A downside to not taking the time to unit test this is that I didn't think deeply about each use of range()
, unless it became obvious it was a problem.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's fine, and the use of xrange
in
seeds = random.sample(xrange(1, 2*10**9), args.num_permutations)
is the only really important one.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, I agree with this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can keep xrange
here and other places -- not a big issue.
#!/usr/bin/env python | ||
|
||
import numpy as np | ||
from past.builtins import xrange |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not needed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You may be entirely correct, and I'm happy to accept this.
See previous comments on range()
vs. xrange()
and consistency.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not a big issue. We can leave as-is.
print '-' * 14, 'Correlation: WRE (Saddlepoint) and WRE (Recursive)', '-' * 14 | ||
print 'All: \\rho={:.5}, P={:.5}'.format(*all_correlation) | ||
print '\Phi_WR < 10^-4: \\rho={:.5}, P={:.5}'.format(*tail_correlation) | ||
print('-' * 14, 'Correlation: WRE (Saddlepoint) and WRE (Recursive)', '-' * 14) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In Python 2, print(a, b, c)
prints (a, b, c). Replace commas with plus signs?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Replace commas with plus signs?
Good catch. Yes, let's use plus signs.
geneToLengthRank.update(zip(geneToLength.keys(), length_ranks)) | ||
threshold_gene = sorted(geneToLength.keys(), key=lambda g: geneToLengthRank[g])[args.length_threshold] | ||
print 'Length of {} longest gene: {}'.format(args.length_threshold, geneToLength[threshold_gene]) | ||
geneToLengthRank.update(list(zip(list(geneToLength.keys()), length_ranks))) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Both list
s unneeded here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I missed this; you are correct, these are not necessary.
setToPval.update(pval.items()) | ||
setToTime.update(time.items()) | ||
setToObs.update(obs.items()) | ||
setToPval.update(list(pval.items())) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice dict.update(iterator)
but not list
needed here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
agree
for M, pval in setToPval.items(): | ||
if setToFDR[M]<=fdr_threshold: | ||
X, T, Z, tbl = setToObs[M] | ||
row = [ ', '.join(sorted(M)), pval, setToFDR[M], setToRuntime[M], T, Z ] + tbl |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Error on line 66 if rows = []. Between lines 65 and 66, maybe add if rows:
and indent lines 66-73 so no output if no results.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, I see what you mean. Yes, I'll correct this.
Latest tested version in parentheses. | ||
[](https://travis-ci.org/raphael-group/wext?branch=master) | ||
|
||
1. Python (2.7.9) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
future
is also currently needed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See comments above on the README; I think I was thinking the requirements.txt
addressed all Python dependencies.
Something like pip install -r requirements.txt
should address concerns of all users, I think. (Could be wrong, happy to change.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, but not everyone will think to check requirements.txt
or run pip install -r requirements.txt
. Others may think, justifiably, that running python setup.py install
successfully means that everything ready to go. For a real-life example, it crashed for me because I tried it in a new virtual machine with the usual dependencies but hadn't needed future
yet.
If we add it to the README, then we'll save a few emails, GitHub issues, and StackOverflow searches, which is worth it for everyone.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we add it to the README, then we'll save a few emails, GitHub issues, and StackOverflow searches, which is worth it for everyone.
Absolutely, and I agree with this as a philosophy whole-heartedly. If there are any GitHub issues, it's 99% the fault of the developer, even if the problem is one of presentation.
Yes, but not everyone will think to check requirements.txt or run pip install -r requirements.txt.
So, the motivation above wasn't to remove this information. Rather, I'm trying to cater to the lazy user (myself included) which reduces GitHub issues, e-mails, SO questions, etc.
When I look at new tools in bioinformatics, I want to see a succinct description of the algorithm/code within seconds in the README. This is needed for WExT, at the top of the README. In the current README, requirements are given first. The requirements should be made more comprehensive (e.g. add the python libraries necessary with libraries required as detailed in requirements.txt
) and I think it shouldn't be the first think seen in a README. We need to aim for both succinct and comprehensive :)
The rationale behind this is presentation for new users (as an invitation to use the code), as well as preventing unnecessary Github issues/e-mails.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good idea. Let's move requirements lower in the README if they are too long.
LGTM -- thanks for all of the changes! Let's fix the |
No description provided.