Chapter 5 data extraction code does not work as Apache mailing list has changed its interface #143

Open
@oonisim

Description

The Apache mailing list archive has changed its interface and is no longer served by mod_mbox of Apache HTTP. As a result, a URL like http://mail-archives.apache.org/mod_mbox/spark-dev/201911.mbox/ajax/thread?0 now causes an error because of the /ajax part.

With /ajax removed, the URL http://mail-archives.apache.org/mod_mbox/spark-dev/201911.mbox/thread?0 redirects to the new interface ([email protected], November 2019). The new interface does not provide an MBOX format listing, so the MBOX format elements such as FROM, TO, and SUBJECT cannot be extracted.

The thread ID pattern is now different too, e.g. https://lists.apache.org/thread/hg85hhvt270of8fdrmb62kfvm7rpl96p.
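
To make the two URL changes concrete, here is a minimal Python sketch that only manipulates strings and makes no network requests. The helper names `strip_ajax` and `thread_id` are hypothetical, and the thread ID pattern (lowercase letters and digits) is an assumption based on the single example above, not a documented format.

```python
import re
from urllib.parse import urlsplit, urlunsplit


def strip_ajax(url: str) -> str:
    """Drop the 'ajax' path segment from an old mod_mbox style URL."""
    parts = urlsplit(url)
    segments = [s for s in parts.path.split("/") if s != "ajax"]
    return urlunsplit(parts._replace(path="/".join(segments)))


# Assumption: thread IDs are lowercase alphanumerics, as in the one example seen.
THREAD_RE = re.compile(r"^https://lists\.apache\.org/thread/([0-9a-z]+)$")


def thread_id(url: str):
    """Return the thread ID from a new-style URL, or None if it does not match."""
    m = THREAD_RE.match(url)
    return m.group(1) if m else None


old_url = "http://mail-archives.apache.org/mod_mbox/spark-dev/201911.mbox/ajax/thread?0"
print(strip_ajax(old_url))
print(thread_id("https://lists.apache.org/thread/hg85hhvt270of8fdrmb62kfvm7rpl96p"))
```

This does not restore MBOX extraction. It only shows that the old thread index (`thread?0`) and the new opaque thread ID are different kinds of identifier, so the Chapter 5 code cannot be fixed by rewriting the URL alone.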
