Database Tutorial and Free VB code
This is a v1.0 draft of a quick
introduction to the tables, queries and VB code section of the Access database
used as a lotto data-mining tool. Although I would agree with the
sentiment that there are no shortcuts in learning programming (like learning
music or Math you see it or you don’t) there are a few automation shortcuts in
the Access database, such as the use embedded SQL (Structured Query Language) in
the VB editor to enable getting things done quickly.
The modern database is built around the relational model, which groups pieces of
data into similar entities as a means to reduce redundant data (an
oversimplification, therefore the "1st draft" label to this
bloogie). Your table(s) will generally model or be defined by what you're
trying to track. In this case it will be a Pick 5 game. At the most
granular level, you are generally asking questions about certain metrics, when
did this # last show up? Or how many times in the last 7 10, 20, … 100 draws?
Once you have built one test, you build another…and so on. Perhaps a new
unexpected insight leads to yet another (15 is picked 5x in the last 6 draws…
what other things happen when that occurs?). I like the following metaphor
to describe the most used parts of the Database: the tables are the chassis of
the car, the queries are the engine, the forms are the doors and/or windows and
the reports are the chrome. This write up will focus on tables and queries
and hopefully work its way to moving the queries to code so folks can build,
test and win. I’ll call it a draft because if there are questions,
I’ll fill in some of the blanks and because this draft is by no means
exhaustive to the topics.
Table Design The rules of table design can be thought of as a
blend of Art and Science, but they “are really just formalized common sense”
as famously quoted by Database guru and author C.J. Date. Although the
most often books list normalization rules up to “third normal form” we’ll
concern ourselves with the first two when modeling/tracking attributes of a pick
5 lotto game. First Normal Form states that data in the tables’ fields
must be atomic, that is broken down to the smallest unit (example: store
someone’s name as last name and first name separately, so if you have to
specify a criteria on a name it is easier to do when one field contains one
value only). Second Normal Form has to be with not allowing redundant
fields in the relational table structure. One will want to employ the
primary key concept, which enforces uniqueness in a column that in turn protects
against duplicate data and/or missing values.
When you create fields for your DB tables, do not include spaces in field names
"FirstName" (NOT "First Name") because when you use the
field in code you'll have to bracket it "[First Name]" otherwise
you'll trigger an error. Similarly don't start your fields with a digit or
those other non Alpha-numeric fields eg. "-" because your code will
think it is the subtraction operator which is likely not your intention.
Basic Queries Most of requested uses of SQL (Structured
Query Language) are straightforward, though there are many esoteric parts that
can get tricky. The basic structure is:
SELECT fieldname1, fieldname2
FROM YourTableName
WHERE fieldname1 = 123
When joining tables in a query, there are different kinds of joins that can be
done that mirror some of the different kinds of relationships among tables and
other queries that typically exist in a database (one to one, one to many, many
to many is not supported directly; it requires an intermediate table for a
lookup work-around).
There are basic SELECT queries and Action queries (INSERT, APPEND, UPDATE,
DELETE). You're not required to capitalize the Keywords, it is just good
practice for the sake of readability; you are doing yourself a favor down the
line when you return to the code and you want to quickly figure out its meaning
without draining your brain of horsepower at 3PM or 3AM. There are also
Aggregate queries, which allow you to easily get Sums, Averages, Min, Max,
Variance, Standard Dev; your basic "Descriptive Statistics" functions
built in. You can call your own user defined function, provided that it is
already saved in your Module window of your database.
When learning how to work with queries, I'd strongly recommend going to the pull
down "View" menu and select "SQL" view so you can read what
you are doing, it is pretty close to standard English. Once familiar with
SQL it can be placed in the VB Editor as embedded SQL, which can be a fast
method for automating various routines and functions. Below is an example
that can be strung together as part of a larger program to suit your needs.
' a brief example of SQL embedded in the VB editor; an update query is shown
Docmd.RunSQL "UPDATE tblData SET D1 = 2 WHERE D1=1;"
The UNION query is also a very useful construct. It allows you to append
one recordset to another, provided that both recordsets have the same fields, in
the same order and are of the same datatype.
UNION ALL will include duplicates in the resulting recordset, where UNION will
quietly not include duplicate records. You could add a WHERE clause to
filter this by Date range, for example.
'save this query as "UNION_ALLDraws_1col"
SELECT tblData.D1
FROM tblData
UNION ALL
SELECT tblData.D2
FROM tblData
UNION ALL
SELECT tblData.D3
FROM tblData
UNION ALL
SELECT tblData.D4
FROM tblData
UNION ALL
SELECT tblData.D5
FROM tblData;
' this returns the
frequency count of all 39 numbers , when query above has been named as
instructed.
SELECT UNION_ALLDraws_1col.D1, Count(UNION_ALLDraws_1col.D1) AS CountOfD1
FROM UNION_ALLDraws_1col
GROUP BY UNION_ALLDraws_1col.D1;
Variables
in your code For this introductory article, I think we’ll keep
the variables inside the function (“at the procedure level”) in which they
operate to be able to skim over global variables. The time during which a
variable retains its value is known as its lifetime. The value of a
variable may change over its lifetime, but it retains some value. When a
variable loses scope, it no longer has a value.A procedure-level variable
declared with the Dim statement retains a value until the procedure is finished
running. If the procedure calls other procedures, the variable retains its value
while those procedures are running as well. If a procedure-level variable is
declared with the Static keyword, the variable retains its value as long as code
is running in any module. When all code has finished running, the variable loses
its scope and its value. Its lifetime is the same as a module-level variable.
Also always have "Option Explicit" at the top of each code or
form module. Otherwise you will think that a variable is in scope because the
code compiles, whereas what is acually happening is that a new (local) variable
has been created on the fly.
DAO / ADO
("Data Access Objects", the old object model three generations prior,
pre .NET)
Although this is the old way of doing things, I believe this syntax is easier to
memorize and therefore get up to speed with. Make sure to have a reference
in the References dialog box set to Microsoft DAO 3.6 or 3.51 library for the
compiler to recognize the code. Below is a very basic loop that can be
built upon after having some practice with it. The ADO (ActiveX Data
Objects) I wrote about 5 years ago has some sloppiness or bad habits, but it
works and at the time I wanted to quickly prototype things and move on to the
next. I did put in the time to learn ADO because I was going to jump into
the ASP.NET/ADO.NET technology. It is a Rapid Application Development (RAD)
technology that builds Web pages that can communicate with a back-end Database
(it's optimized for SQL Server 2005). Access wasn't designed for web use
or multi-user use; you'll need to upgrade to SQL Server 2005… I think MSFT
also has free, scaled down version of SQL available to compete with the
“free” stuff out there. I like all the productivity tools with Visual
Studio 2005 Professional so it makes the work rather fun and very efficient.
'you could assign D1 in the loop or you could call a user defined function
'this code assumes you have a tblData table or query in your DB or the code will
'trigger a run-time error; error handling may be covered in a later draft
Sub BasicRecordsetLoop()
Dim db As Database, rs As Recordset
Set db = CurrentDb 'use the current DB
Set rs = db.OpenRecordset("SELECT D1, D2 FROM tblData;")
With rs
rs.movefirst
do
rs.edit ''' you'll get a run
time error if there is rs.edit without rs.update
rs!D1 = rs!D1 * 1.1
rs.update ' change
won't happen if rs.update is omitted
rs.movenext
loop until rs.eof 'eof = End of File
End With
Set rs = Nothing ' free up memory
End Sub
'Tables tblCtAll6, tblCtAll10, tblCtAll20 must exist in the DB and the fields
will be Date, 1, 2, 3… 39 for the loop in subroutine "LoopCounts"
going up to 39. Datatype for digits should be Number. This code
will get the frequency count of numbers 1, 2, 3 … 39 in the last 6, 10, 20
draws in an efficient manner (i.e. look at the line of code that begins "rs(strFld)=".
Sub SubRoutineThatCallsAnotherSub()
'this one passes an argument for code re-useability a design goal as you build
things...
Call LoopCounts(6)
Call LoopCounts(10)
Call LoopCounts(20)
End Sub
Sub LoopCounts(intX As Integer)
Dim db As Database, rs As Recordset, strFld As String, i As Integer, strNum As
String, strTable As String
strTable = "tblCtAll" & CStr(intX)
Set db = CurrentDb
Set rs = db.OpenRecordset("SELECT *, Date FROM " & strTable &
" ORDER BY Date DESC;") '
With rs
rs.MoveFirst
Do
i = 1
Do
strFld = CStr(i)
rs.Edit
rs(strFld) = fxCountXBack(i, intX, CDate(rs!Date))
'
rs.Update
i = i + 1
Loop Until i > 39
rs.MoveNext
Loop Until rs.EOF
End With
Set rs = Nothing
End Sub
'below is some sloppy ADO (written 5 or 6 years ago) that is called in the sub
above
'Please forgive the messiness below
Function fxCountXBack(Num As Integer, ItNum As Integer, dtDate As Date) As
Integer
'counts how many times each number has been used in last 100 draws
Dim cnn1 As ADODB.Connection, StrRS100 As String
Dim rs100 As ADODB.Recordset, SumALL As Integer
Dim D1 As Long, D2 As Long, D3 As Long, D4 As Long, D5 As Long
Set cnn1 = New ADODB.Connection
Set cnn1 = CurrentProject.Connection
Set rs100 = New ADODB.Recordset
StrRS100 = "SELECT tblData.Date, tblData.D1, tblData.D2, tblData.D3,
tblData.D4, tblData.D5" & _
" FROM tblData WHERE (((tblData.Date) Between #" & dtDate &
"# And #" & dtDate & "# -" & ItNum &
")) " & _
"ORDER BY tblData.Date DESC"
rs100.Open StrRS100, cnn1
'Debug.Print StrRS100
rs100.MoveFirst
SumALL = 0
Do
If Num = rs100("D1") Then
SumALL = SumALL + 1
ElseIf Num = rs100("D2")
Then
SumALL = SumALL + 1
ElseIf Num = rs100("D3")
Then
SumALL = SumALL + 1
ElseIf Num = rs100("D4")
Then
SumALL = SumALL + 1
ElseIf Num = rs100("D5")
Then
SumALL = SumALL + 1
End If
rs100.MoveNext
Loop Until rs100.EOF
fxCountXBack = SumALL
End Function