Hello,
My limited knowledge of VBA has left me stuck on something, so I find myself here.
A few years ago, I used QueryTables (code lifted from a macro recording) to download table data from a website (baseball-reference.com). I downloaded league-wide batting, pitching, and fielding statistics for every year, and the code worked beautifully, because I only had to specify the URL and table name in the code:
Code:
With Worksheets("Sheet1").QueryTables.Add(Connection:= "LINK", Destination:=Worksheets("Sheet1").Range("$A$1"))
The above is an example that gets the data from the fielding table for first basemen in 1956. By substituting variables for the years and positions, I could get the data for every year: even though the tables sometimes changed from page to page (some columns were added for later years), the code simply pulled all data regardless.
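The substitution described above could look roughly like the sketch below. The helper name BuildUrl, the {YEAR} and {POS} tokens, the template string, and the position codes are all placeholders made up for illustration; they are not the site's real URL scheme. Each URL printed here would be what gets passed as the Connection argument.

```vba
Option Explicit

' Hypothetical helper: fills {YEAR} and {POS} tokens in a URL template.
' The token names are illustrative assumptions, not the site's URL format.
Function BuildUrl(ByVal template As String, ByVal yr As Long, ByVal pos As String) As String
    BuildUrl = Replace(Replace(template, "{YEAR}", CStr(yr)), "{POS}", pos)
End Function

' Prints one URL per year/position pair to the Immediate window.
Sub ListUrls()
    Const URL_TEMPLATE As String = "LINK-{YEAR}-{POS}"   ' placeholder template
    Dim positions As Variant, p As Variant, yr As Long

    positions = Array("1B", "2B")      ' placeholder position codes
    For yr = 1956 To 1957              ' placeholder year range
        For Each p In positions
            Debug.Print BuildUrl(URL_TEMPLATE, yr, CStr(p))
        Next p
    Next yr
End Sub
```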
This no longer works. As best I can tell, the table properties on the site have changed, and my queries now return nothing for the tables I'm interested in. However, recording a macro while extracting data from the web gives me code that returns the data as a ListObject:
Code:
ActiveWorkbook.Queries.Add Name:="Player Fielding - 2B Table", Formula:= _
"let" & Chr(13) & "" & Chr(10) & " Source = Web.Page(Web.Contents(""LINK""))," & Chr(13) & "" & Chr(10) & " Data5 = Source{5}[Data]," & Chr(13) & "" & Chr(10) & " #""Changed Type"" = Table.TransformColumnTypes(Data5,{{""Rk"", Int64.Type}, {""Name"", type text}, {""Age"", Int64.Type}, {""Tm"", type text}, {""Lg"", type text}, {""G"", Int64.Type}, {""GS"", Int64.Type}" & _
", {""CG"", Int64.Type}, {""Inn"", type number}, {""Ch"", Int64.Type}, {""PO"", Int64.Type}, {""A"", Int64.Type}, {""E"", Int64.Type}, {""DP"", Int64.Type}, {""Fld%"", type number}, {""Rtot"", Int64.Type}, {""Rtot/yr"", Int64.Type}, {""Rtz"", Int64.Type}, {""Rdp"", Int64.Type}, {""RF/9"", type number}, {""RF/G"", type number}})" & Chr(13) & "" & Chr(10) & "in" & Chr(13) & "" & Chr(10) & " #""Changed Type"""
ActiveWorkbook.Worksheets.Add
With Worksheets("Sheet2").ListObjects.Add(SourceType:=0, Source:= _
    "OLEDB;Provider=Microsoft.Mashup.OleDb.1;Data Source=$Workbook$;Location=""Player Fielding - 2B Table"";Extended Properties=""""" _
    , Destination:=Worksheets("Sheet2").Range("$A$1")).QueryTable
    .CommandType = xlCmdSql
    .CommandText = Array("SELECT * FROM [Player Fielding - 2B Table]")
    .RowNumbers = False
    .FillAdjacentFormulas = False
    .PreserveFormatting = True
    .RefreshOnFileOpen = False
    .BackgroundQuery = True
    .RefreshStyle = xlInsertDeleteCells
    .SavePassword = False
    .SaveData = True
    .AdjustColumnWidth = True
    .RefreshPeriod = 0
    .PreserveColumnInfo = True
    .ListObject.DisplayName = "Player_Fielding___2B_Table"
    .Refresh BackgroundQuery:=False
End With
Now this works, but every column heading has to be specified, and when the code hits a table that is missing a column or has an extra one, it throws an error. That means I can't use this code for all the pages for all the years; I would have to keep modifying it. What I'd like is to grab all of the data from these tables without specifying column headers, like the good ol' days. Seems there should be a way...any thoughts?
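One possibility, as a minimal sketch only: in the recorded formula, the column names appear only in the "Changed Type" step (Table.TransformColumnTypes), so a query that stops at the Data step names no headers at all. The names BuildQueryFormula and AddHeaderFreeQuery are made up for illustration, and the sketch assumes the wanted table sits at the same index (5 in the recording) on every page, which may not hold. Columns would come back without the explicit type conversion.

```vba
Option Explicit

' Hypothetical helper: builds the M formula text without the
' "Changed Type" step, so no column names are hard-coded.
' Assumes url contains no double-quote characters.
Function BuildQueryFormula(ByVal url As String, ByVal tableIndex As Long) As String
    BuildQueryFormula = _
        "let" & vbCrLf & _
        "    Source = Web.Page(Web.Contents(""" & url & """))," & vbCrLf & _
        "    Data = Source{" & tableIndex & "}[Data]" & vbCrLf & _
        "in" & vbCrLf & _
        "    Data"
End Function

' Hypothetical wrapper: registers the query in the active workbook.
Sub AddHeaderFreeQuery(ByVal queryName As String, ByVal url As String, ByVal tableIndex As Long)
    ActiveWorkbook.Queries.Add Name:=queryName, Formula:=BuildQueryFormula(url, tableIndex)
End Sub
```

Calling AddHeaderFreeQuery "Player Fielding - 2B Table", "LINK", 5 would stand in for the recorded Queries.Add line; the ListObjects.Add block that loads the query onto the sheet would stay as recorded.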
Thanks in advance.