    Basic Web Automation for Collecting Activity

      scottalanmiller

      Some people have asked for this. It has been a while since I knew where the code was stored, but I stumbled on it the other day and figured I would share. People are often curious how practical screen scraping is done. This is a very crufty example, but it works and is easy to automate (throw it into cron and you are done). No special tools needed:

      #!/usr/bin/bash
      #
      # Version 1.0
      # 11 January 2014
      #
      # Abstract: To automate the collection of Spiceworks activity data at the current time through the direct querying of the SW Community
      
      ## Initialize temp file (truncate it so no blank line ends up in the report)
      : > /tmp/swreport
      
      ## Generate Data
      # Fetch each user's activity page once, then pull every field from the saved copy
      for i in $(cat /opt/scripts/staff); do
          page=$(curl -b /opt/scripts/cookies.txt "http://community.spiceworks.com/people/$i/activity" 2>/dev/null)
          points=$(printf '%s\n' "$page" | grep Points | cut -d">" -f 3 | cut -d "<" -f1)
          answers=$(printf '%s\n' "$page" | grep Answer | cut -d">" -f 3 | cut -d "<" -f1)
          posts=$(printf '%s\n' "$page" | grep Posts | cut -d">" -f 3 | cut -d "<" -f1)
          pepper=$(printf '%s\n' "$page" | grep '"title"' | cut -d"<" -f2 | cut -d">" -f2)
          # Variables are left unquoted on purpose so whitespace collapses to single spaces;
          # sed strips the thousands separators so sort can treat Points as a number
          echo $i $points $answers $posts $pepper | sed 's/,//g' >> /tmp/swreport
      done
      
      ## Format Report
      echo "Screenname Points BAs HPs Pepper" > /tmp/swreport.sorted
      echo " " >> /tmp/swreport.sorted
      sort -k2nr /tmp/swreport >> /tmp/swreport.sorted
      
      column -c 4 -t -s $' ' /tmp/swreport.sorted > /tmp/swreport.col
      
      echo "This daily report of Spiceworks standings is generated automatically by the swreport.sh script on to-lnx-dev. \
            This report is created by directly querying the Spiceworks Community at the time of creation and is completely \
            up to date at creation time." > /tmp/swreport.for
      echo "" >> /tmp/swreport.for
      cat /tmp/swreport.col >> /tmp/swreport.for
      
      ## Send Out Report
      mail -s "Spiceworks Daily Report - Straight from the Server" [email protected] < /tmp/swreport.for
      

      The script is quick and dirty, with hard-coded locations. It requires the text file /opt/scripts/staff to contain the list of usernames to query, separated by spaces or newlines. Add as many names to the list as you want in the report.
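To see how the grep and cut pipeline in the script pulls a number out of the page, here is a minimal sketch run against a made-up line of markup. The sample HTML and the `extract_stat` helper are assumptions for illustration only; the real Spiceworks page layout is not shown in this post and may differ.

```shell
#!/usr/bin/env bash
# Hypothetical sample line; the real page markup may differ.
sample='<div class="stat"><span>12,345</span> Points</div>'

# extract_stat LABEL: read HTML on stdin, keep the lines that mention LABEL,
# take the text after the second ">" up to the next "<", then drop commas.
extract_stat() {
    grep "$1" | cut -d">" -f3 | cut -d"<" -f1 | sed 's/,//g'
}

points=$(printf '%s\n' "$sample" | extract_stat Points)
echo "Points: $points"
```

Because the field positions are counted by hand, this kind of extraction breaks whenever the page layout changes, so it is worth re-checking the output after any site redesign.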


        Before this script will run properly, you also have to log in with curl and a valid account to obtain a session cookie. The script passes that cookie with every request, reading it from /opt/scripts/cookies.txt.
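As a rough sketch of that step: curl's `-c` option writes the cookies it receives to a file, and the script then reads them back with `-b`. The login URL argument and the form field names (`email`, `password`) below are placeholders, not the real Spiceworks login form, and `fetch_cookie` and `have_cookie` are hypothetical helpers.

```shell
#!/usr/bin/env bash
COOKIE_JAR=${COOKIE_JAR:-/opt/scripts/cookies.txt}

# Placeholder login request: the URL and the form field names are
# assumptions; inspect the site's actual login form before using this.
fetch_cookie() {
    local login_url=$1 user=$2 pass=$3
    curl -s -c "$COOKIE_JAR" \
         --data-urlencode "email=$user" \
         --data-urlencode "password=$pass" \
         "$login_url" > /dev/null
}

# Succeeds if the jar holds at least one cookie entry.
# In curl's cookie file, lines starting with "# " are comments, while
# lines starting with "#HttpOnly_" are real cookies.
have_cookie() {
    [ -s "$1" ] && grep -Eqv '^(# |#$|[[:space:]]*$)' "$1"
}

if ! have_cookie "$COOKIE_JAR"; then
    echo "No usable cookie in $COOKIE_JAR; log in first." >&2
fi
```

Running a check like `have_cookie` at the top of the report script would catch a missing cookie file before the script mails out an empty report. It does not detect a cookie that has expired on the server.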
