Python 3.x support + PEP8 Cleanup by bsandeepan · Pull Request #10 · bids-standard/pyedf

bsandeepan · 2020-07-12T06:11:30Z

Kindly review the code and let me know if the commits are okay. I will add the docstrings for functions and methods in a separate pull request as @palday asked me to do earlier.

EDF.py

palday · 2020-07-13T09:27:00Z

EDF.py

-# meas_info should have ['record_length', 'magic', 'hour', 'subject_id', 'recording_id', 'n_records', 'month', 'subtype', 'second', 'nchan', 'data_size', 'data_offset', 'lowpass', 'year', 'highpass', 'day', 'minute']
-# chan_info should have ['physical_min', 'transducers', 'physical_max', 'digital_max', 'ch_names', 'n_samps', 'units', 'digital_min']
-####################################################################################################
+    return (str(buf) + ' ' * num) if (num >= 0) else buf[0:num]


We don't have tests to show this, but this looks exactly like the tye of thing that will break in Python 3. In the padding case, a string is returned. In the trim case, it's unclear whether a string or bytes are returned.

This syntax is globally accepted and in my system, it has not break in both py2.7 and py3.x yet. You may test it individually and let me know if it does.

This is one of the downsides to runtime typing. The syntax is valid, but you might still get an error depending on the types passed.

I just checked the code and errors are unlikely. See,

padtrim() is a helper function used to standardize header content to pass into writebyte(). In Python 2.x, strings and bytes are synonymous unlike in python 3.x. If you check the logic of writebyte(), it has error-handling support which will deal with both bytes and string formats i.e. output will be a bytes value. If some other format is returned (such as float or boolean), which is very unlikely logically, then we have to worry. I am not sure how I should write a unit test for least likely scenario

There is an advantageous to having consistent types internally. For example, this could be fixed by having the trim case also convert to str. Alternatively, if doing the preferred way of dealing with str in Python 3, we assume that everything internally is str and only the actual I/O operations are dealing with bytes and those are doing the conversion appropriately (then we don't need the conversion to str in the pad case). writebyte handles the output bit of that, and I think (but haven't checked) that all of the file.read are explicitly converted to str (or int).

palday · 2020-07-13T09:28:51Z

EDF.py

+        self.init_properties()
+        if fname:
+            self.open(fname)
+
+    def init_properties(self):
        self.fname = None
        self.meas_info = None
        self.chan_info = None
        self.calibrate = None
        self.offset    = None
        self.n_records = 0
-        if fname:
-            self.open(fname)
-
+


I don't understand why this change was necessary. If it were possible to initialize the properties elsewhere, this would make sense, but that isn't the case. Even then, init_properties should be private: _init_properties.

The function init_properties() contains a few value reset lines. These lines are actually called again in present line 118 when we are closing the EDF file. This function wrap only reduces the redundant lines and now the original call is covered in init()

I am explaining this after 10 months of original edit. If it sounds mistaken, let me know.

EDF.py

.gitignore

palday · 2020-07-13T11:03:01Z

EDF.py

-####################################################################################################
-# the following are a number  of helper functions to make the behaviour of this EDFReader
-# class more similar to https://bitbucket.org/cleemesser/python-edf/
-####################################################################################################


don't delete this

This section is covered in details in the added documentation -
Helper Methods: The following are a number of helper functions to make
the behaviour of this EDFReader class more similar to
https://bitbucket.org/cleemesser/python-edf/
get_signal_text_labels: Convert Signal Text Labels from unicode
to strings.
get_n_signals: Get Number of Channels.
get_signal_freqs: Get Signal Frequencies.
get_n_samples: Get Total Number of Samples.
read_signal: Reads Entire Signal Record and returns the
values as an array.

There is value to having the documentation close to the relevant code.

EDF.py

README.md

EDF.py

palday · 2020-07-13T11:12:29Z

EDF.py

+    implementation details, check out https://www.edfplus.info/index.html
+"""
+__version__ = "0.4.0"
+__author__ = "https://github.com/robertoostenveld | https://github.com/bsandeepan95>"


I think traditionally this is name + email address. Note also the distinction between author, credits/contributora, and maintainera.

Yes. I don't know robert's mail id so I put it like this. Even in original code, he has not added his email id or anything as far as I have checked. I wished to do it properly using his name as author and ours as contributors but I don't have the necessary resources. Hence I had to wait till this code review.

@robertoostenveld please send me your email id so I can add it properly.
@palday you too.

If you look at the output of git log locally, you'll see committer email addresses. 😄

@palday your email id is not there.

palday · 2020-07-13T11:18:49Z

One thing I forgot: write_byte really should be write_bytes because (1) it's usually more than 1 byte of data being written and (2) it's a helper function to write things of the bytes type.

bsandeepan · 2020-07-27T13:56:33Z

@palday I am new to this pull requests so I there are some changes that I wish to do right away and commit but it seems there are many such ones and github is providing me a "Commit Change" button. If I use that button, will it make numerous new comments for each change I save, or will it stage them together later when the pull is successful? Seeking your guidance.

palday · 2020-07-27T22:40:35Z

I think "Commit change" creates a new commit for every single change, but lots of small commits are fine -- especially because they'll all be squashed during the merge.

Thanks for doing this: I know it's a lot of effort, and we do appreciate this. My review is critical, but it's meant to be constructive.

palday · 2020-07-29T12:16:40Z

EDF.py

+    return (str(buf) + ' ' * num) if (num >= 0) else buf[0:num]
+
+
+def writebyte(file, content, encoding='utf8'):


Suggested change

def writebyte(file, content, encoding='utf8'):

def writebyte(file, content, encoding='ascii'):

writebyte should default to using ASCII encoding because that's what the EDF spec actually requires.

I remember you asking me to do this a year ago. I forgot. Apologies. 😅

palday · 2020-07-29T12:23:26Z

EDF.py

+    try:
+        # Py3 onwards bytes and strings are separate data format
+        content = bytes(content, encoding)
+    except TypeError:
+        # Py2.7 Support
+        pass
+    except Exception as e:
+        print(type(e))
+        print(str(e))
+        print(
+            "If you see this message, please go to " + 
+            "https://github.com/bids-standard/pyedf/issues" + 
+            " and open an issue there regarding this. Thank you!")
+    finally:
+        file.write(content)


Suggested change

try:

# Py3 onwards bytes and strings are separate data format

content = bytes(content, encoding)

except TypeError:

# Py2.7 Support

pass

except Exception as e:

print(type(e))

print(str(e))

print(

"If you see this message, please go to " +

"https://github.com/bids-standard/pyedf/issues" +

" and open an issue there regarding this. Thank you!")

finally:

file.write(content)

if type(content) != bytes:

content = content.encode(encoding)

file.write(content)

(This works in Python 2.7 and Python 3 -- I haven't tested it in older Python 2, but am not interesting in supporting that!)

Also, note that the finally clause here doesn't make much sense: if there has been a conversion error, then you don't want to try to write the bytes out to file. It would be better to simply leave out the extra except Exception as e and finally clauses and have file.write be after the try block. But in this case, there's an even easier solution (see suggestion).

Since I implemented Error Handling block here, I used finally clause to help people understand the code flow. Since there is only one line left to do after the block, and I know developers who can indent such lines one level inside the except clause thinking it as a mistake on our end. I wish to avoid clearing up such confusion in the future if anyone faces this.

I added the except Exception as e clause to ensure people know what to do if they face such error. I can always do away with this. This is only there for least expected scenarios.

bytes is not a supported type in Python 2.7 so the if clause will not work there. However, as we have already added the support in except TypeError clause, how about we put the content = content.encode(encoding) line in there? This will ensure that encoding is done correctly for Python 2.7.

Python 2.7.17 (default, Jul 20 2020, 15:37:01) [GCC 7.5.0] on linux2 Type "help", "copyright", "credits" or "license" for more information. >>> x = "a" >>> type(x) <type 'str'> >>> type(x) == bytes True >>> type(x) == str True

Right. bytes is a synonym for str in Python 2.7.

I mean, if we are to do ascii encoding either way, why not enforce the line in the TypeError block, which will be triggered in Python 2.7 since default str encoding scheme is utf-8.

Okay, I ran a bit of code for testing the error handling block in Python 2.7 and I feel the block is necessary. Run the following example -

# -*- coding: utf-8 -*- enc = 'ascii' x = 'µ Õ ß' print x try: y = x.encode(enc) except Exception as e: print("Error Message: " + str(e)) print("Error Type: " + str(type(e))) y = 'fail' print y print (x == y)

The output will be like the following -

Python 2.7.16 (default, Jul 13 2019, 16:01:51) [GCC 8.3.0] on linux2 µ Õ ß Error Message: 'ascii' codec can't decode byte 0xc2 in position 0: ordinal not in range(128) Error Type: <type 'exceptions.UnicodeDecodeError'> fail False

So errors will occur if there are non-ascii characters, and the code will fail. Now the question is if it is better to let the code fail, or if we should handle the error gracefully and display it to the console for the developer. Let me know your opinion about this @palday .

@palday waiting for your response.

bsandeepan · 2020-08-06T16:13:21Z

I think "Commit change" creates a new commit for every single change, but lots of small commits are fine -- especially because they'll all be squashed during the merge.

Thanks for doing this: I know it's a lot of effort, and we do appreciate this. My review is critical, but it's meant to be constructive.

Please do. I am grateful for your review. It is making me rethink some of my past coding choices and shaping my perspective for future work. I may sometimes come out difficult to communicate, but I am trying to improve my own bugs too here. 😄

Strict review is necessary. Else we get unexpected drivers issues that discourage me from using Ubuntu Linux in my AMD Vega Graphics Workstation. >__>

bsandeepan · 2020-08-12T17:29:23Z

@palday, check the new commits

palday · 2021-02-09T22:53:26Z

I need to write some tests to check this, but it's on my radar.

bsandeepan · 2021-02-10T09:00:57Z

@palday I know we agreed to add the latest documentations in a later PR, but will it help the tests if I add them in this PR?

palday · 2021-03-16T14:48:35Z

Having gone back through the history of this PR .... I'm somewhat surprised that it didn't take advantage of the work already done in #4. But we can't change the past.

I think the best solution for now is to merge this into a py3k branch and provide documentation for installing from that branch for Python 3 support. I'll discuss with @robertoostenveld how to best package up all the work from #4 and here together and hopefully get that back into the trunk branch and set up a PyPI release.

There seem to be some merge conflicts (which I don't quite understand since py3k currently is the same as master and master didn't have merge conflicts), but I'll have to look at those in a few hours and then I'll merge.

bsandeepan added 8 commits September 7, 2019 14:06

Converted Py2.x code to Py3.x compatible code

1974a82

Made EDF.py Py2.7+3.x compatible.

f22c1e9

Modified code syntax to reflect PEP-8 Guidelines

8f596bf

Added documentation + independent module design.

eb768d1

EDF.py is independent module.removed testedf.py.

bf765d5

cleanup

7f117d7

cleanup II

75a6e94

only py3 support

d88f578

palday requested changes Jul 13, 2020

View reviewed changes

palday reviewed Jul 29, 2020

View reviewed changes

= added 3 commits August 12, 2020 22:43

restructured styling. Restructured functions.

d0fda11

Optimization : removed 'if' from writebyte

a983a7c

Edited README

ff41c02

bsandeepan mentioned this pull request Aug 12, 2020

accidental unicode input case handling #11

Closed

bsandeepan requested a review from palday February 8, 2021 08:52

palday changed the base branch from master to py3k March 16, 2021 14:37

palday force-pushed the py3k branch from 349d825 to 9361d60 Compare March 16, 2021 20:52

palday merged commit 00eb640 into bids-standard:py3k Mar 16, 2021

		return (str(buf) + ' ' * num) if (num >= 0) else buf[0:num]


		def writebyte(file, content, encoding='utf8'):

	def writebyte(file, content, encoding='utf8'):
	def writebyte(file, content, encoding='ascii'):

Conversation

bsandeepan commented Jul 12, 2020 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

bsandeepan Jul 27, 2020 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

bsandeepan Jul 27, 2020 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

palday commented Jul 13, 2020

Uh oh!

bsandeepan commented Jul 27, 2020

Uh oh!

palday commented Jul 27, 2020

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

bsandeepan commented Aug 6, 2020 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

bsandeepan commented Aug 12, 2020

Uh oh!

palday commented Feb 9, 2021

Uh oh!

bsandeepan commented Feb 10, 2021

Uh oh!

palday commented Mar 16, 2021

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

bsandeepan commented Jul 12, 2020 •

edited

Loading

bsandeepan Jul 27, 2020 •

edited

Loading

bsandeepan Jul 27, 2020 •

edited

Loading

bsandeepan commented Aug 6, 2020 •

edited

Loading