apsahtml2csv(directory, file.name, file.ext = ".htm")
A .csv file is written. It includes columns containing the APSA job listing ID number; the date the job advertisement was posted; the type of institution; the title of the position; the start date, salary, and region; the name of the institution and department; the name, address, city, state, ZIP code, and phone number of the individual to contact; the department or institution's web address; and a full paragraph description of the position.
The full paragraph description is stored in a column named desc. Due to the current parsing strategy, this field may include some excess characters from the APSA html page.
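Because the desc column may carry leftover characters from the html page, a light cleanup pass after reading the .csv can help. The following is a minimal sketch only: it assumes the excess characters are stray html tags, `&nbsp;` entities, and extra whitespace, which may not match what a given page actually leaves behind. The `jobs` data frame, its `id` column, and the helper `clean_desc` are hypothetical stand-ins, not part of the package.

```r
# Hypothetical stand-in for rows read from the .csv written by apsahtml2csv.
jobs <- data.frame(
  id = c(1, 2),
  desc = c("<p>Assistant Professor of Comparative Politics.&nbsp;</p>",
           "  Apply by   December 1.<br>"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: strips the kinds of residue assumed above.
clean_desc <- function(x) {
  x <- gsub("<[^>]+>", " ", x)               # drop anything that looks like an html tag
  x <- gsub("&nbsp;", " ", x, fixed = TRUE)  # replace non-breaking space entities
  x <- gsub("[[:space:]]+", " ", x)          # collapse runs of whitespace
  trimws(x)                                  # remove leading and trailing spaces
}

jobs$desc <- clean_desc(jobs$desc)
```

Inspecting a few rows of desc before and after cleaning is a sensible check, since a pattern this broad could also remove text that merely resembles a tag.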
apsahtml2csv parses the html code from the downloaded APSA webpages, then sorts and stores the relevant content. A .csv file is written containing this content.
If the user downloads the APSA webpages using a different (or no) file extension, that extension (or "") should be specified using the file.ext argument. Because apsahtml2csv uses the value of file.ext in a grep command, we strongly recommend that the directory specified by directory contain only the downloaded webpages, and no other files or directories.
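To illustrate why stray files in the directory can cause trouble, the sketch below shows how a substring match on an extension can select files that are not downloaded pages. It does not reproduce the internals of apsahtml2csv, whose exact grep call is not shown here; the directory name and file names are made up for the demonstration.

```r
# Build a throwaway directory with two downloaded pages and two stray files.
page_dir <- file.path(tempdir(), "apsa_pages_demo")
dir.create(page_dir, showWarnings = FALSE)
invisible(file.create(file.path(
  page_dir, c("page1.htm", "page2.htm", "notes.htm.txt", "readme.txt")
)))

all_files <- list.files(page_dir)

# A plain substring match on the extension also picks up "notes.htm.txt".
loose <- grep(".htm", all_files, fixed = TRUE, value = TRUE)

# Requiring the name to end with the extension excludes the stray file.
strict <- all_files[endsWith(all_files, ".htm")]
```

Keeping the directory limited to the downloaded pages avoids the question entirely, which is why that is the recommended setup.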
Institutions are inconsistent in how they enter the names of their jobs' contact representatives. Thus, some tweaking of the output of apsahtml2csv may be required in order to create a .csv file that can be seamlessly read by read.murl. Specifically, the user may have to take the single column of the output of apsahtml2csv called contact, and create columns called title, fname, and lname.
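One possible way to derive title, fname, and lname from contact is sketched below. It is a rough heuristic rather than a recommended procedure: it assumes each entry has at least one name token, treats a leading token ending in a period as a title, takes the first remaining token as the first name, and takes the last token as the last name. The sample names and the helper `split_contact` are hypothetical, and real entries will likely need manual review.

```r
# Hypothetical contact strings; real entries vary more than this.
jobs <- data.frame(
  contact = c("Dr. Jane Smith", "Prof. John Q. Public", "Maria Lopez"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: returns a data frame with title, fname, and lname.
split_contact <- function(x) {
  parts <- strsplit(trimws(x), "[[:space:]]+")
  out <- lapply(parts, function(p) {
    # Treat a leading token ending in "." as a title, e.g. "Dr." or "Prof.",
    # but only when at least two name tokens follow it.
    has_title <- length(p) > 2 && grepl("\\.$", p[1])
    title <- if (has_title) p[1] else NA_character_
    rest <- if (has_title) p[-1] else p
    data.frame(title = title,
               fname = rest[1],
               lname = rest[length(rest)],
               stringsAsFactors = FALSE)
  })
  do.call(rbind, out)
}

# Append the three new columns next to the original contact column.
jobs <- cbind(jobs, split_contact(jobs$contact))
```

Entries without a recognizable title get NA in the title column, and middle names or initials are dropped, so suffixes such as "Jr." or multi-word surnames would be handled incorrectly by this sketch.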
read.murl