h2. ReadMe
--
.ruby-gemset: should normally be saas_media_ad_app.
--
Web crawling app, currently focused on crawling FB and YT videos.
Uses Ruby 2.2.X (2.2.5 as of this post) or 2.3.X (2.3.1 as of this post). All development has moved to Ruby 2.3.X (2.3.1 as of now) to be consistent, but as of the end of August, Ruby 2.2.5 was working completely fine. Ruby 2.0.X and 2.1.X are not an option for the Rails part, since Rails 5 requires Ruby 2.2.2 or newer.
Rails 5 app, though the primary meat of the application is plain Ruby (with a few Rails variables and database models intermixed in the code) and Watir (a Ruby gem).
Using the latest of everything available as of the end of 2016.
Only works on Unix-like systems (Linux, Mac) right now. It could probably work on Windows, but that would require changes. The instructions here are for Linux. Mac is essentially the same: use MacPorts, Homebrew, or a similar package manager, and skip xvfb and headless. The app was tested to work on both platforms as of early September 2016. Since then, it has been tested almost exclusively on Linux.
h3. Should have locally installed
The following are needed because of Ruby library (gem) dependencies. Install them from apt-get or a similar package manager:
- git
- xvfb
- ImageMagick
- ffmpeg (only barely needed, for video durations; perhaps something else could suffice)
- pip (Python)
- wget
- curl
- Chrome (might need to be added via a 3rd-party source)
- Chromedriver (an installer for Linux is included in lib/scripts; for Mac it is on Homebrew, if not elsewhere too)
- PhantomJS
- Firefox (backup)
- Possibly could use Chromium and Chromium-Chromedriver instead
- cron
- MySQL is the database. I'm sure Percona or MariaDB would work, but I haven't verified that.
- Apache is the primary web server, with Passenger being used. Anything should work, though.
- Install youtube-dl manually:
sudo curl -L https://yt-dl.org/downloads/latest/youtube-dl -o /usr/local/bin/youtube-dl
sudo chmod a+rx /usr/local/bin/youtube-dl
- Update youtube-dl regularly via:
sudo pip install --upgrade youtube-dl
- If pip is not installed yet:
sudo apt install python-pip python-dev
- Set up Papertrail separately as well if you are going to use it: http://help.papertrailapp.com/kb/configuration/configuring-centralized-logging-from-ruby-on-rails-apps/
- most likely via remote_syslog2
- if using remote_syslog2, enter the log files in /etc/log_files.yml
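A quick way to see which of the command-line tools listed above are missing is a small Ruby script like the one below. This is only a sketch and not part of the app. The binary names are assumptions (for example, ImageMagick is looked up as @convert@ and Chrome as @google-chrome@), so adjust them for your distribution.

```ruby
# Hypothetical helper, not part of the app: reports which command-line
# tools from the list above cannot be found on PATH.

# Binary names are assumptions; they may differ on your distribution.
REQUIRED_TOOLS = %w[
  git Xvfb convert ffmpeg pip wget curl
  google-chrome chromedriver phantomjs firefox youtube-dl mysql
].freeze

# Returns the full path of an executable, or nil if it is not on PATH.
def find_executable(name, path = ENV.fetch('PATH', ''))
  path.split(File::PATH_SEPARATOR).each do |dir|
    candidate = File.join(dir, name)
    return candidate if File.file?(candidate) && File.executable?(candidate)
  end
  nil
end

# Returns the subset of tool names that could not be located.
def missing_tools(tools = REQUIRED_TOOLS, path = ENV.fetch('PATH', ''))
  tools.reject { |tool| find_executable(tool, path) }
end

if __FILE__ == $PROGRAM_NAME
  missing = missing_tools
  if missing.empty?
    puts 'All listed tools were found on PATH.'
  else
    puts "Not found on PATH: #{missing.join(', ')}"
  end
end
```

The script only reads PATH and prints a report; it does not install anything.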
h3. .env keys
Probably or definitely needed:
SENTRY_DSN
aws_s3_access_key_id, aws_s3_secret_access_key
yt_key, yt_search_key (I don't think so any more)
server_id
current_user (the current Unix user)
ROOT_URL (for LP ripping), LINUX_ROOT_URL (for LP ripping), MAC_ROOT_URL (if using Mac)
DEVISE_SECRET_KEY (I think)
BLAZER_DATABASE_URL (not being used, though)
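To check a machine against the keys above, something like the following could be run after the .env file has been loaded into the environment (however the app does that). It is a sketch, not part of the app: the key names are copied from this README, and since it is uncertain which ones are truly required, it only reports what is unset.

```ruby
# Hypothetical helper, not part of the app: lists which of the .env keys
# named in this README are unset or blank.

# Which of these are truly required is uncertain (see the notes above),
# so this only reports; it does not abort anything.
EXPECTED_ENV_KEYS = %w[
  SENTRY_DSN aws_s3_access_key_id aws_s3_secret_access_key
  yt_key yt_search_key server_id current_user
  ROOT_URL LINUX_ROOT_URL MAC_ROOT_URL
  DEVISE_SECRET_KEY BLAZER_DATABASE_URL
].freeze

# env can be ENV or any Hash-like object that responds to [].
def missing_env_keys(env = ENV, keys = EXPECTED_ENV_KEYS)
  keys.select { |key| env[key].to_s.strip.empty? }
end

if __FILE__ == $PROGRAM_NAME
  missing = missing_env_keys
  puts(missing.empty? ? 'All listed keys are set.' : "Unset keys: #{missing.join(', ')}")
end
```

Passing a plain Hash instead of ENV makes it easy to try out, for example @missing_env_keys({ 'server_id' => '1' }, %w[server_id ROOT_URL])@.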
h3. cron jobs via whenever gem
The whenever gem reads its schedule from a config file, and config/schedule.rb is the default location. Because there are 3 to 4 main parts to the app, there are 3 schedule files instead:
- schedule-app.rb
- schedule-fb.rb
- schedule-yt.rb
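Since none of these is the default config/schedule.rb, each file has to be handed to whenever explicitly. The sketch below builds one command per file using whenever's @--load-file@ and @--update-crontab@ options. That the files live in config/, and the crontab identifiers derived from the file names, are assumptions made for this example.

```ruby
require 'shellwords'

# File names come from the list above; the config/ directory and the
# crontab identifiers are assumptions made for this sketch.
SCHEDULE_FILES = %w[schedule-app.rb schedule-fb.rb schedule-yt.rb].freeze

# Builds one `whenever` command line per schedule file. A separate
# identifier per file keeps each file's crontab block independent.
def whenever_commands(files = SCHEDULE_FILES, config_dir = 'config')
  files.map do |file|
    identifier = File.basename(file, '.rb')
    path = File.join(config_dir, file)
    ['bundle', 'exec', 'whenever',
     '--update-crontab', identifier,
     '--load-file', path].shelljoin
  end
end

puts whenever_commands if __FILE__ == $PROGRAM_NAME
```

The script only prints the commands, so they can be reviewed before being run from the app root.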
Other notes:
- Puma is in the Gemfile and can be used for quick setup.
- @TODO Need to get initial data covered
- @TODO Example of the ENV variables needed
- Using /opt/ right now and symlinking from there. This is not required, but it is how the app works right now.
- No official test suite as of right now. There are unofficial ones that require manually reviewing the results. @TODO explain this
- @TODO Deployment instructions
h3. File stuff to do:
Create /opt/srv/afma-v/d/ first, then:
- /opt/srv/afma-v/d/af
- /opt/srv/afma-v/d/ay
- /opt/srv/afma-v/d/l
- /opt/srv/afma-v/d/headless
- /opt/srv/afma-v/d/headless/0
- /opt/srv/afma-v/d/headless/1
- /opt/srv/afma-v/d/headless/2
- /opt/srv/afma-v/d/headless/l
Possibly also:
- /opt/srv/afma-v/headless/{0,1,2,l}
Optionally, the testing ones as well:
- /opt/srv/afma-v/d/af_testing
- /opt/srv/afma-v/d/ay_testing
- /opt/srv/afma-v/d/l_testing
Symlink each into #{ Rails.root }/public/d/
Symlink headless into #{ Rails.root }/public
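The directory and symlink steps above can be scripted roughly as follows. This is a sketch, not part of the app. It assumes that "each" means af, ay and l, and that "headless" means /opt/srv/afma-v/d/headless. The "possibly also" and testing directories are left to the caller.

```ruby
require 'fileutils'

# Hypothetical setup helper, not part of the app.
# Subdirectory names are taken from the list in this README.
DATA_SUBDIRS = %w[af ay l headless headless/0 headless/1 headless/2 headless/l].freeze
TESTING_SUBDIRS = %w[af_testing ay_testing l_testing].freeze

# Creates base plus every listed subdirectory; returns the created paths.
def create_data_dirs(base, subdirs = DATA_SUBDIRS)
  subdirs.map do |sub|
    dir = File.join(base, sub)
    FileUtils.mkdir_p(dir)
    dir
  end
end

# Symlinks each named entry of base into target_dir, skipping any link
# or file that already exists there. Returns the link paths.
def link_into(base, names, target_dir)
  FileUtils.mkdir_p(target_dir)
  names.map do |name|
    link = File.join(target_dir, name)
    FileUtils.ln_s(File.join(base, name), link) unless File.exist?(link) || File.symlink?(link)
    link
  end
end

# Example usage (needs write access to /opt, so it is left commented out;
# rails_root stands in for Rails.root):
#   base = '/opt/srv/afma-v/d'
#   create_data_dirs(base)
#   create_data_dirs(base, TESTING_SUBDIRS)   # optional
#   link_into(base, %w[af ay l], File.join(rails_root, 'public', 'd'))
#   link_into(base, %w[headless], File.join(rails_root, 'public'))
```

Both helpers can be re-run safely: @mkdir_p@ ignores existing directories and existing links are skipped.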
h3. Non-version controlled files:
- .env
- config/database.yml
- .ruby-gemset
- config/schedule.yml
--
The groupdate gem (and maybe another chart gem) needs the MySQL time zone tables loaded:
mysql_tzinfo_to_sql /usr/share/zoneinfo | mysql -u root mysql -p
--
As of 2016-12-02, the FB crawling has to use a Firefox or IE user agent. Safari or other browsers might work as well, but I haven't checked. This isn't likely to work for long, but here's hoping.
--
Updated 2017-10
h2. Terminology
element-children-methods: methods that do more than define elements; they mostly depend on another method that already defined a browser element.