1
00:00:10,376 --> 00:00:11,306
>> Alright, so welcome back

2
00:00:11,306 --> 00:00:14,946
to computer science S-75
this is lecture 4 on sequel.

3
00:00:15,436 --> 00:00:20,926
So, what did we do Monday?

4
00:00:21,066 --> 00:00:22,856
Anyone? XML, alright so good.

5
00:00:22,856 --> 00:00:25,666
So, we talked about XML stands
for extensible markup language

6
00:00:25,666 --> 00:00:28,156
and in one sentence what is
it or what is it good for?

7
00:00:28,586 --> 00:00:28,706
Yes.

8
00:00:30,466 --> 00:00:34,116
>> It's good for structuring
data in an easy way.

9
00:00:34,346 --> 00:00:36,426
>> Okay, good for
structuring data in an easy way

10
00:00:36,716 --> 00:00:38,136
and we looked at a few examples.

11
00:00:38,136 --> 00:00:40,936
We looked at how you might
structure lectures for instance.

12
00:00:40,936 --> 00:00:44,296
We looked at how you might
structure a pizza menu

13
00:00:44,296 --> 00:00:48,416
or at least the beginnings
thereof, and what are some

14
00:00:48,416 --> 00:00:50,596
of the design constraints
on an XML file?

15
00:00:50,596 --> 00:00:54,046
What features can you have or
what features can you not have?

16
00:00:54,966 --> 00:00:55,616
How about someone else?

17
00:00:55,616 --> 00:00:55,736
Yeah.

18
00:00:56,166 --> 00:00:57,366
>> Can only have one root.

19
00:00:57,756 --> 00:00:59,346
>> Can only have one
root element, okay good.

20
00:00:59,346 --> 00:00:59,976
What's your name again?

21
00:01:00,166 --> 00:01:00,396
>> Louis [assumed spelling].

22
00:01:00,516 --> 00:01:01,096
>> Louis, okay.

23
00:01:01,096 --> 00:01:02,956
So, you can only have
one root element.

24
00:01:02,956 --> 00:01:05,146
You can have stuff at the top
of the file like comments,

25
00:01:05,476 --> 00:01:08,096
you can have the XML
declaration which his optional

26
00:01:08,096 --> 00:01:10,256
so there might be some
stuff up there and all

27
00:01:10,256 --> 00:01:12,606
of that hangs below the
so-called document note

28
00:01:12,806 --> 00:01:15,246
and that document note could
become more relevant once we get

29
00:01:15,246 --> 00:01:17,086
the JavaScript and even
if you're already familiar

30
00:01:17,086 --> 00:01:18,606
with some JavaScript
you might know

31
00:01:18,606 --> 00:01:21,296
that there's this global
object known as document

32
00:01:21,356 --> 00:01:23,876
in all lower case if you've
programed in JavaScript before.

33
00:01:23,876 --> 00:01:25,346
So, more on that to come.

34
00:01:25,346 --> 00:01:27,496
Alright, so with XML what
are some other features

35
00:01:27,496 --> 00:01:29,776
of this markup language?

36
00:01:30,386 --> 00:01:33,826
Yeah, Axel.

37
00:01:34,186 --> 00:01:36,816
>> I really don't know if
[inaudible] is a feature,

38
00:01:37,116 --> 00:01:38,836
but elements can
have attributes.

39
00:01:38,906 --> 00:01:40,496
>> Okay, so elements
can have attributes.

40
00:01:40,496 --> 00:01:42,336
So, if you want to
markup some data

41
00:01:42,336 --> 00:01:44,496
but then associate some
additional data with it,

42
00:01:44,756 --> 00:01:46,356
you can have these
things called attributes

43
00:01:46,356 --> 00:01:49,346
and an attribute is
usually key equals "value",

44
00:01:49,346 --> 00:01:52,866
you can have multiple
attributes on an XML element,

45
00:01:52,866 --> 00:01:55,556
you can have zero
attributes on an XML element,

46
00:01:55,556 --> 00:01:58,416
you can't have identically named
attributes on the same element

47
00:01:58,686 --> 00:02:01,216
and you have to have those
attributes values quoted,

48
00:02:01,546 --> 00:02:05,946
but besides that you at
least have an alternative

49
00:02:06,206 --> 00:02:06,876
to this approach.

50
00:02:06,876 --> 00:02:09,546
What's the alternative to making
some piece of data an attribute

51
00:02:10,346 --> 00:02:12,186
in terms of the design
of an XML file?

52
00:02:12,366 --> 00:02:12,466
Yeah.

53
00:02:13,066 --> 00:02:15,056
>> Making children
elements within the element.

54
00:02:15,246 --> 00:02:15,816
>> Yeah, exactly.

55
00:02:15,816 --> 00:02:18,286
Making children of an
element is an alternative.

56
00:02:18,286 --> 00:02:19,486
So, what's an argument for

57
00:02:19,486 --> 00:02:21,036
or against either
of these approaches?

58
00:02:21,326 --> 00:02:23,476
Attributes versus children?

59
00:02:24,036 --> 00:02:31,336
Someone else, otherwise it's
just going to me and Axel today.

60
00:02:31,516 --> 00:02:33,686
Yes, Axel in the front.

61
00:02:33,896 --> 00:02:37,356
>> Okay, if the data itself
doesn't have a child you might

62
00:02:37,406 --> 00:02:38,306
want to use an attribute.

63
00:02:38,436 --> 00:02:38,656
>> Okay.

64
00:02:39,056 --> 00:02:42,476
>> But if it would an
additional child then you have

65
00:02:42,476 --> 00:02:42,926
to use an element.

66
00:02:43,076 --> 00:02:43,536
>> Okay, good.

67
00:02:43,536 --> 00:02:45,136
So, if you think that
your piece of data

68
00:02:45,136 --> 00:02:46,936
that you're representing
might at some point need

69
00:02:46,936 --> 00:02:50,036
to be extended in the same
way that a person was extended

70
00:02:50,036 --> 00:02:53,246
in our simple little Amazon
purchase order example whereby

71
00:02:53,246 --> 00:02:56,396
we added an initial, we added
an address then maybe it should

72
00:02:56,396 --> 00:02:58,196
indeed be a child element
so that you've reserved

73
00:02:58,196 --> 00:02:59,566
that flexibility for yourself.

74
00:02:59,566 --> 00:03:02,886
If instead you make is an
attribute, that's pretty much a,

75
00:03:02,886 --> 00:03:05,676
the buck stops there, you can't
further extend the definition

76
00:03:05,906 --> 00:03:08,286
so attributes tend to be
good for small pieces of data

77
00:03:08,286 --> 00:03:11,316
like IDs, or short
words, or phrases,

78
00:03:11,506 --> 00:03:12,656
but you wouldn't typically put

79
00:03:12,656 --> 00:03:16,586
for instance a paragraph
long description of a pizza

80
00:03:16,666 --> 00:03:18,676
for instance in an
attribute, you could,

81
00:03:18,936 --> 00:03:20,526
but at that point it
just doesn't lend itself

82
00:03:20,526 --> 00:03:23,056
to readability then it gets
harder to have quote marks

83
00:03:23,166 --> 00:03:25,856
in the attributes value,
so in short this is one

84
00:03:25,856 --> 00:03:27,736
of those things that you
just kind of start to know

85
00:03:27,736 --> 00:03:30,486
when you see it, but when
it comes to your pizza menu,

86
00:03:30,726 --> 00:03:33,046
recall that going to want to
be careful with the design

87
00:03:33,046 --> 00:03:36,286
because if you for instance
create a pizza child;

88
00:03:36,286 --> 00:03:39,056
what was one of the downsides
even though we talked very

89
00:03:39,056 --> 00:03:41,836
briefly about it on Monday,
what was the downside arguably

90
00:03:41,836 --> 00:03:44,486
of coming up with your
own pizza element?

91
00:03:44,486 --> 00:03:44,636
Yeah, Jack.

92
00:03:44,636 --> 00:03:46,656
>> Well, if you want
to use the same thing

93
00:03:46,656 --> 00:03:47,566
to find the pizza element,

94
00:03:47,646 --> 00:03:50,856
to find let's say
something in your sandwiches.

95
00:03:50,856 --> 00:03:50,966
>> Okay.

96
00:03:50,966 --> 00:03:53,626
>> You're not going to be
able to do the same thing

97
00:03:53,626 --> 00:03:56,046
because they're not
the same element.

98
00:03:56,046 --> 00:03:56,566
>> Exactly.

99
00:03:56,566 --> 00:03:58,056
If you want to find
two different things

100
00:03:58,056 --> 00:04:00,626
in the file you have to know in
advance what those things are.

101
00:04:00,626 --> 00:04:01,956
You have to know that
there's a pizza element,

102
00:04:01,956 --> 00:04:03,966
you have to know there's a
salad element, you have to know

103
00:04:03,966 --> 00:04:07,696
that there's a grinder
element and what not and so,

104
00:04:07,696 --> 00:04:09,626
you've written now
a lot of PHP code

105
00:04:09,786 --> 00:04:12,716
that has these keywords
hardcoded.

106
00:04:13,226 --> 00:04:16,456
Now, that's fine if the pizza
guy wants to add more pizzas

107
00:04:16,456 --> 00:04:18,166
to menu or add more salads,

108
00:04:18,576 --> 00:04:21,866
but under what scenario
would this design break?

109
00:04:23,116 --> 00:04:26,376
And by break I mean you the
developer have to go back

110
00:04:26,516 --> 00:04:29,996
to the pizza shop and
change actual PHP code

111
00:04:29,996 --> 00:04:31,166
as opposed to XML.

112
00:04:31,366 --> 00:04:32,456
>> They want a whole new item.

113
00:04:32,456 --> 00:04:33,296
>> They want a whole item.

114
00:04:33,296 --> 00:04:35,446
They want to sell ice cream
or frozen yogurt or something

115
00:04:35,446 --> 00:04:38,286
that just wasn't anticipated
by you and so now you need

116
00:04:38,286 --> 00:04:43,626
to have a FROZEN yogurt
element and hardcode that,

117
00:04:43,826 --> 00:04:46,826
so there again is sort
of the manifestation

118
00:04:46,826 --> 00:04:49,156
of a poor design decision
and so we started talking

119
00:04:49,156 --> 00:04:50,586
about alternatives
to this, and again,

120
00:04:50,786 --> 00:04:52,866
I disclaim there's no one right
way so you don't just have

121
00:04:52,866 --> 00:04:55,186
to copy what we started to
do in Notepad that other day,

122
00:04:55,636 --> 00:04:59,376
but instead we have
category elements

123
00:04:59,376 --> 00:05:01,756
and we had item elements
and we put the names

124
00:05:01,756 --> 00:05:03,636
for those categories
and items in that case

125
00:05:03,916 --> 00:05:06,816
in the attribute,
in attribute values.

126
00:05:06,816 --> 00:05:08,106
We could have done
child elements

127
00:05:08,336 --> 00:05:10,736
but at least there we had a
more general design so that

128
00:05:10,736 --> 00:05:12,826
if you're writing PHP code
as you will be this week

129
00:05:12,826 --> 00:05:14,826
and this weekend, you
can have a for loop

130
00:05:14,826 --> 00:05:18,146
or a while loop that's iterating
over all of the item elements

131
00:05:18,146 --> 00:05:20,816
in the DOM and you don't
have to worry too much

132
00:05:20,816 --> 00:05:22,876
about whether it's a pizza or
a salad cause you can figure

133
00:05:22,876 --> 00:05:24,716
that out on the fly, so again,

134
00:05:24,716 --> 00:05:27,046
that's the thought process
you should have when sketching

135
00:05:27,046 --> 00:05:28,786
out your own XML file.

136
00:05:29,426 --> 00:05:32,656
Alright. Alright, so just
a word on Project Zero,

137
00:05:32,656 --> 00:05:36,546
so do bear in mind on page
1 of the project spec,

138
00:05:36,546 --> 00:05:38,376
as well as in the syllabus
is the specification

139
00:05:38,376 --> 00:05:40,386
for the course's policies
on academic honesty

140
00:05:40,566 --> 00:05:42,396
and they essentially boil
down to this; you're welcome

141
00:05:42,396 --> 00:05:45,336
and encouraged to collaborate
verbally with each other,

142
00:05:45,336 --> 00:05:47,166
stand in front of a whiteboard
if you've gotten friendly

143
00:05:47,166 --> 00:05:49,506
with classmates and talk through
various design decisions,

144
00:05:49,836 --> 00:05:51,286
but when it comes
time to write code

145
00:05:51,286 --> 00:05:53,436
that should be done
completely separately

146
00:05:53,436 --> 00:05:55,066
so that everyone
ultimately is submitting his

147
00:05:55,066 --> 00:05:57,576
or her own projects, so
discussions in English good,

148
00:05:57,576 --> 00:05:59,606
in pseudocode good, in code bad.

149
00:05:59,926 --> 00:06:02,506
So, just bear in mind that and
read through it in more detail.

150
00:06:03,146 --> 00:06:06,486
Alright, so that was XML and
what was the point of XML?

151
00:06:06,486 --> 00:06:08,626
Well, we'll see today
an alternative

152
00:06:08,626 --> 00:06:09,826
to something like XML.

153
00:06:09,826 --> 00:06:11,656
We'll actually look at
databases and we'll look

154
00:06:11,656 --> 00:06:14,106
at a language called SEQUEL,
structured query language

155
00:06:14,106 --> 00:06:16,716
that allows you to express
yourself even more powerfully

156
00:06:16,716 --> 00:06:20,776
than you can with XPath,
whereby XPath refers to what?

157
00:06:20,776 --> 00:06:21,266
Ah, Jack.

158
00:06:23,776 --> 00:06:28,736
>> PHP method for going
into and finding XML files.

159
00:06:28,736 --> 00:06:31,686
>> Good, the PHP method
for going through XML files

160
00:06:31,686 --> 00:06:33,486
and finding things
and just to be clear,

161
00:06:33,486 --> 00:06:35,736
it's not a PHP feature
per se that happens

162
00:06:35,736 --> 00:06:38,516
to be PHPs implementation
thereof, it's a feature

163
00:06:38,516 --> 00:06:42,546
of PHP SimpleXML API, but
XPath is language independent

164
00:06:42,546 --> 00:06:45,836
and indeed there exist
XPath processors for Java,

165
00:06:45,836 --> 00:06:48,236
for C++ for all sorts
of languages,

166
00:06:48,446 --> 00:06:52,906
so XPath itself is a language
separate from SEQUEL, from PHP,

167
00:06:53,196 --> 00:06:55,806
from all of these other
alternative languages,

168
00:06:55,906 --> 00:06:58,526
so we looked at a few
features of XPath now.

169
00:06:58,756 --> 00:07:02,496
So, we had these location paths
that kind of feel a little

170
00:07:02,756 --> 00:07:04,346
like C:/ program
files or what not,

171
00:07:04,346 --> 00:07:07,486
so something a little familiar
to people but what kinds

172
00:07:07,486 --> 00:07:11,426
of features does an
XPath expression have,

173
00:07:11,426 --> 00:07:13,246
an XPath location path?

174
00:07:15,176 --> 00:07:18,266
We had these things called
steps and a step is kind

175
00:07:18,266 --> 00:07:20,546
of like the word implies, it
allows you to start at the root

176
00:07:20,546 --> 00:07:23,236
of the document and dive in
deeper and deeper and deeper

177
00:07:23,236 --> 00:07:26,386
like taking steps into the
tree, what else did steps do?

178
00:07:26,386 --> 00:07:27,396
>> Yeah, you can also
say which attribute

179
00:07:31,216 --> 00:07:31,586
in the certain step [inaudible].

180
00:07:31,586 --> 00:07:32,026
>> Exactly.

181
00:07:32,026 --> 00:07:34,856
If you want to look at an
attribute's value like we did

182
00:07:34,856 --> 00:07:36,396
for the lecture number

183
00:07:36,396 --> 00:07:39,316
in our very short example
involving a generation

184
00:07:39,316 --> 00:07:40,696
of an unordered HTML list of all

185
00:07:40,696 --> 00:07:42,546
of the lectures we've
had thus far,

186
00:07:42,696 --> 00:07:47,166
we grabbed the lecture number
attributes using @number

187
00:07:47,356 --> 00:07:50,116
for instance or rather that's
what it would be in XPath,

188
00:07:50,116 --> 00:07:51,986
we happen to do that
particular example in PHP

189
00:07:51,986 --> 00:07:54,446
where we use the
square bracket notation.

190
00:07:54,716 --> 00:07:56,566
So, just realize we looked at
a couple of different things

191
00:07:56,566 --> 00:07:59,616
in PHP we had the
SimpleXML API that allows you

192
00:07:59,616 --> 00:08:02,186
to use the arrow operator,
so hyphen angled bracket

193
00:08:02,396 --> 00:08:05,466
to stepdown into a child
and XPath allows you

194
00:08:05,466 --> 00:08:08,216
to do essentially the same
thing but more powerfully,

195
00:08:08,326 --> 00:08:11,446
it's more of an, it's a more
expressive language than just go

196
00:08:11,446 --> 00:08:13,446
to a child or go to an attribute

197
00:08:13,566 --> 00:08:15,726
because even though we just
scratched the surface or recall

198
00:08:15,726 --> 00:08:19,826
that XPath has multiple axes
and an axis is something

199
00:08:19,826 --> 00:08:22,566
like child colon, colon
or attribute colon,

200
00:08:22,566 --> 00:08:25,756
colon and those are generally
not even written explicitly just

201
00:08:25,756 --> 00:08:27,146
cause they would be
incredibly tedious

202
00:08:27,146 --> 00:08:29,416
to write XPath expressions
with those long words,

203
00:08:29,666 --> 00:08:33,316
but there are things like
decedents, or ancestor,

204
00:08:33,316 --> 00:08:36,986
or sibling and those might seem
kind of unnecessary right now,

205
00:08:37,206 --> 00:08:39,796
but indeed there are
scenarios where it's useful.

206
00:08:39,796 --> 00:08:42,946
If you find yourself navigating
for instance your XML file

207
00:08:43,166 --> 00:08:45,476
and you're at a pizza element
and for whatever reason you want

208
00:08:45,476 --> 00:08:48,306
to get all of the other pizza
elements at that same level

209
00:08:48,386 --> 00:08:50,386
in the tree, that
level in the DOM,

210
00:08:50,576 --> 00:08:53,046
you can use the sibling
axis to just get all

211
00:08:53,046 --> 00:08:54,746
of those pizza elements
for instance

212
00:08:54,986 --> 00:08:57,476
and more powerfully
think about this way,

213
00:08:57,476 --> 00:09:01,196
when you're implementing your
pizza ML E-Commerce sites

214
00:09:01,666 --> 00:09:05,226
and a user has added an item
to his or her shopping cart;

215
00:09:05,426 --> 00:09:07,366
let's give some thought
as to what should go

216
00:09:07,366 --> 00:09:08,566
in the shopping cart now.

217
00:09:09,086 --> 00:09:10,706
So, the shopping
cart is implemented

218
00:09:10,706 --> 00:09:14,406
in code via what PHP feature?

219
00:09:14,586 --> 00:09:15,836
Yeah, Jack.

220
00:09:15,836 --> 00:09:16,116
>> Session.

221
00:09:16,116 --> 00:09:16,826
>> Yeah, the session.

222
00:09:17,006 --> 00:09:20,296
So, the session is the
super global that allows you

223
00:09:20,296 --> 00:09:24,046
to illusion of state with the
user and by illusion I mean

224
00:09:24,046 --> 00:09:26,096
that even http is stateless,

225
00:09:26,316 --> 00:09:29,196
as soon as you stop requesting
an HTML file cause it's been

226
00:09:29,196 --> 00:09:31,736
downloaded, that's it, the
network connection does not stay

227
00:09:31,936 --> 00:09:34,116
on the present, but there's
this thing of cookies

228
00:09:34,116 --> 00:09:38,156
and this handstamp metaphor
whereby you are reminding the

229
00:09:38,156 --> 00:09:39,696
server who you are
so the server can

230
00:09:39,696 --> 00:09:41,216
in turn give you
back the illusion

231
00:09:41,216 --> 00:09:43,086
of that same shopping cart.

232
00:09:43,326 --> 00:09:45,116
So, what should you put
in the shopping cart?

233
00:09:45,266 --> 00:09:49,116
A super global like session
is just an associative array

234
00:09:49,296 --> 00:09:51,136
which means you can put
keys and values in it,

235
00:09:51,366 --> 00:09:52,526
which is actually
pretty versatile.

236
00:09:52,526 --> 00:09:54,276
It means you can put pretty
much anything in it you want.

237
00:09:54,776 --> 00:09:57,686
So, what should you be putting
into the session super global

238
00:09:57,896 --> 00:10:01,226
when a user says, give
me a medium cheese pizza,

239
00:10:01,226 --> 00:10:03,866
submit or whatever your
mechanism is for getting

240
00:10:03,866 --> 00:10:04,876
that input from the user?

241
00:10:05,206 --> 00:10:07,946
What should you put
in the session?

242
00:10:08,046 --> 00:10:08,236
Yeah.

243
00:10:08,236 --> 00:10:10,416
>> Size, item, price.

244
00:10:10,586 --> 00:10:12,416
>> Good, so size, item
and what the last?

245
00:10:12,416 --> 00:10:12,556
>> Price.

246
00:10:12,976 --> 00:10:14,306
>> Price. Okay, and
what's you name again?

247
00:10:14,306 --> 00:10:14,546
>> Ben.

248
00:10:14,696 --> 00:10:17,006
>> Ben. Okay, so size,
item, price are candidates.

249
00:10:17,006 --> 00:10:17,306
What else?

250
00:10:17,306 --> 00:10:21,416
>> Maybe the XML path to
that particular element.

251
00:10:21,416 --> 00:10:23,836
>> Interesting, so maybe the
XML path to that element.

252
00:10:23,836 --> 00:10:25,446
Why would you want to
store that perhaps?

253
00:10:25,726 --> 00:10:28,946
>> Because on my checkout
page maybe I actually want

254
00:10:34,156 --> 00:10:38,946
to show the item and the name
and everything in it instead

255
00:10:39,276 --> 00:10:44,996
of just storing everything
inside the session which kind

256
00:10:45,336 --> 00:10:46,726
of takes up a lot of space.

257
00:10:46,726 --> 00:10:46,896
>> Okay.

258
00:10:47,136 --> 00:10:48,816
>> With a long name
and a description.

259
00:10:48,816 --> 00:10:50,956
I could just tell my checkout
page where to find it.

260
00:10:50,956 --> 00:10:53,716
>> Okay. Interesting, so
storing this location path,

261
00:10:53,716 --> 00:10:56,976
an XPath expression
essentially would enable you

262
00:10:56,976 --> 00:11:01,036
to query the XML file later to
get back that same elements,

263
00:11:01,216 --> 00:11:05,616
the upside of which is you're
not storing all these pieces

264
00:11:05,616 --> 00:11:07,836
of data like price and
description and what not,

265
00:11:08,086 --> 00:11:09,946
things that you would
want the user to see

266
00:11:09,946 --> 00:11:11,706
in their shopping cart, in their
checkout page for instance,

267
00:11:11,706 --> 00:11:13,766
so there's that optimization
and at the same time,

268
00:11:13,766 --> 00:11:15,206
the location path
is fairly long,

269
00:11:15,206 --> 00:11:18,106
so it's not necessarily
superefficient and what

270
00:11:18,106 --> 00:11:22,896
if for instance, and this
is admittedly a corner case

271
00:11:22,946 --> 00:11:25,926
and it's not something it's
something you can consciously

272
00:11:25,956 --> 00:11:27,246
choose to ignore, but you
should at least trip over it,

273
00:11:27,276 --> 00:11:28,326
if you did store
prices as was it Ben?

274
00:11:28,356 --> 00:11:28,423
>> Yeah.

275
00:11:28,446 --> 00:11:29,316
>> Ben proposed,
what's a downside there?

276
00:11:29,346 --> 00:11:31,056
Find a bug even it's an obscure
corner case in storing prices

277
00:11:31,086 --> 00:11:32,316
and names and descriptions
in the shopping cart?

278
00:11:32,346 --> 00:11:32,413
Yeah.

279
00:11:32,436 --> 00:11:32,766
>> The dollar sign.

280
00:11:32,796 --> 00:11:32,976
>> Okay, so.

281
00:11:33,016 --> 00:11:34,046
[ Inaudible ]

282
00:11:34,046 --> 00:11:34,416
Oh, interesting.

283
00:11:34,416 --> 00:11:36,836
Okay, so if we're using US
dollars we have the dollar sign

284
00:11:36,946 --> 00:11:40,616
symbol which usually
denotes a variable.

285
00:11:40,616 --> 00:11:42,936
Now, we can work around that by
using for instance single quotes

286
00:11:42,936 --> 00:11:44,536
and so forth, but potential bug.

287
00:11:44,536 --> 00:11:44,646
Jack.

288
00:11:44,646 --> 00:11:49,516
>> What if someone can go in
and change their own session

289
00:11:49,516 --> 00:11:54,906
like make negative money so that
they are charging to yourself?

290
00:11:54,956 --> 00:11:56,936
>> So, what if a user
make negative money

291
00:11:56,936 --> 00:11:59,306
by mutating the session object?

292
00:11:59,526 --> 00:12:00,496
So, that's a good thought.

293
00:12:00,606 --> 00:12:03,416
In this case because the
session is stored server side,

294
00:12:03,836 --> 00:12:06,016
they couldn't change
the contents of it,

295
00:12:06,506 --> 00:12:11,006
they could spoof someone else's
cookie by sniffing it wirelessly

296
00:12:11,006 --> 00:12:13,286
and then seeing someone
else's shopping cart,

297
00:12:13,516 --> 00:12:15,396
but even in that scenario
they're going to end

298
00:12:15,396 --> 00:12:17,026
up ordering the wrong
food or they're going

299
00:12:17,026 --> 00:12:19,386
to be buying something,
yeah, someone else put

300
00:12:19,386 --> 00:12:20,206
in their shopping cart.

301
00:12:20,466 --> 00:12:22,266
So, a good thought but in
this case I think we're safe

302
00:12:22,266 --> 00:12:23,246
from that particular attack.

303
00:12:23,886 --> 00:12:25,226
Okay, Axel.

304
00:12:25,226 --> 00:12:29,116
>> Yeah, what if a
user orders a pizza

305
00:12:29,116 --> 00:12:32,306
and then you store the
price in the session,

306
00:12:32,636 --> 00:12:34,306
but then the price is change?

307
00:12:34,336 --> 00:12:34,706
>> Yeah.

308
00:12:34,706 --> 00:12:37,026
>> For whatever reason
and then the guy

309
00:12:37,026 --> 00:12:39,436
who actually buys the
pizza pays the wrong price,

310
00:12:39,436 --> 00:12:40,886
a higher price-lower price?

311
00:12:41,186 --> 00:12:43,386
>> Good. So, this is a
legitimate corner case

312
00:12:43,386 --> 00:12:44,676
and maybe it's not
a big deal cause

313
00:12:44,676 --> 00:12:45,936
if you're changing your
price what are you going to,

314
00:12:45,936 --> 00:12:47,476
you're either going to
gain or lose a few cents,

315
00:12:47,476 --> 00:12:50,116
maybe not a big deal,
but in principle here

316
00:12:50,356 --> 00:12:54,516
if someone knew your prices
were going up and they know

317
00:12:54,516 --> 00:12:58,406
that the items are
stored in the server side,

318
00:12:58,606 --> 00:13:00,676
the prices are stored in
the server side session,

319
00:13:01,056 --> 00:13:03,596
if they just keep around that
cookie that gives then access

320
00:13:03,636 --> 00:13:06,326
to that shopping cart and they
maybe pre-create a whole bunch

321
00:13:06,326 --> 00:13:08,476
of shopping carts using
different browsers or what not,

322
00:13:08,476 --> 00:13:11,326
again obscure corner
case, they will get access

323
00:13:11,326 --> 00:13:13,736
to the old price whether
it's lower or higher,

324
00:13:13,736 --> 00:13:16,476
so not necessarily bad but
it's a little weird, right,

325
00:13:16,476 --> 00:13:19,956
especially if your database is
logging the prices people paid

326
00:13:19,956 --> 00:13:22,126
and even though there's not
a database in this scenario,

327
00:13:22,286 --> 00:13:24,586
now all of a sudden even
though your prices changed

328
00:13:24,586 --> 00:13:27,266
with the new fiscal
year, July 1st of 2012,

329
00:13:27,546 --> 00:13:29,976
somehow your customers are
still paying the old prices,

330
00:13:29,976 --> 00:13:31,536
so in short, it's
just a little weird,

331
00:13:31,536 --> 00:13:34,306
it's a little inconsistent and
it's also just not necessary.

332
00:13:34,336 --> 00:13:36,966
What would be an alternative
to storing these various pieces

333
00:13:36,966 --> 00:13:40,056
of metadata or even
storing something as long

334
00:13:40,056 --> 00:13:43,226
and as string-like as
an XPath expression?

335
00:13:43,466 --> 00:13:44,916
What else could we put
in the shopping cart,

336
00:13:45,116 --> 00:13:48,836
yet still enable this guy
to sell pizzas and more?

337
00:13:50,026 --> 00:13:50,126
Yeah.

338
00:13:50,756 --> 00:13:55,586
>> You could implement
and identifier.

339
00:13:55,586 --> 00:13:55,806
>> Okay.

340
00:13:55,806 --> 00:13:59,836
>> Of each item so they
would have like a number

341
00:13:59,836 --> 00:14:01,056
or a sequence of letters.

342
00:14:01,056 --> 00:14:01,456
>> Okay, good.

343
00:14:01,456 --> 00:14:03,436
>> And then you probably
find that

344
00:14:03,436 --> 00:14:04,956
and essentially find everything.

345
00:14:05,096 --> 00:14:05,696
>> Okay, good.

346
00:14:05,696 --> 00:14:08,846
So, what if we instead gave
everything a unique identifier?

347
00:14:08,846 --> 00:14:12,816
Maybe it's a number like
this is item number 1, 2, 3.

348
00:14:13,016 --> 00:14:16,906
Maybe it's instead and
alphanumeric string

349
00:14:16,906 --> 00:14:21,106
like this is pizza underscore S
for small, underscore anchovies

350
00:14:21,106 --> 00:14:21,996
or something like that.

351
00:14:22,236 --> 00:14:23,966
It could go into a bunch
of different directions

352
00:14:23,966 --> 00:14:25,336
but just having this principle

353
00:14:25,336 --> 00:14:27,626
of a unique identifier
it's pretty compelling.

354
00:14:27,986 --> 00:14:31,136
Now, push back on that, so have
an argument among ourselves

355
00:14:31,136 --> 00:14:32,816
here, what's bad
about that design?

356
00:14:32,816 --> 00:14:32,906
Jack.

357
00:14:32,906 --> 00:14:34,006
>> It's the person
who's editing and going

358
00:14:34,006 --> 00:14:35,916
through all these unique
identifiers and decides

359
00:14:35,916 --> 00:14:42,036
that they want to change certain
identifiers or that they need

360
00:14:42,086 --> 00:14:43,276
to add something
somewhere it wasn't before.

361
00:14:43,276 --> 00:14:43,456
>> Ah hum.

362
00:14:43,456 --> 00:14:45,746
>> So, it's kind of messy
for the person who has

363
00:14:45,746 --> 00:14:48,336
to implement new items.

364
00:14:48,336 --> 00:14:48,826
>> Okay, good.

365
00:14:48,826 --> 00:14:51,046
So, what if the pizzeria
guy in short needs

366
00:14:51,046 --> 00:14:54,126
to change things later, update
things, you're assuming a lot

367
00:14:54,126 --> 00:14:55,796
of sophistication
now from this guy

368
00:14:55,796 --> 00:14:58,476
who told you build him a site
that allows him to edit it

369
00:14:58,476 --> 00:15:01,746
with TextEdit or Notepad.exe
and if you now have to put

370
00:15:01,746 --> 00:15:04,176
in your instructions to this
fellow, well you now have

371
00:15:04,176 --> 00:15:06,496
to make sure you choose a
unique identifier for every one

372
00:15:06,496 --> 00:15:08,116
of your products, you know,

373
00:15:08,116 --> 00:15:11,276
the questions going be frankly
what's a unique identifier

374
00:15:11,276 --> 00:15:12,876
which, you know, even
though that's, you know,

375
00:15:12,876 --> 00:15:16,186
says what it means to a layman
that's not necessarily the

376
00:15:16,186 --> 00:15:18,526
clearest statement so
this is also just assuming

377
00:15:18,826 --> 00:15:21,786
that the human involved is going
to care about these details too

378
00:15:22,026 --> 00:15:24,446
and frankly a pizza guy and
even a technical guy should not,

379
00:15:24,666 --> 00:15:27,176
they should for the record, they
might be in the mutual exclusive

380
00:15:27,176 --> 00:15:30,686
in this world, they it's just
not a detail you should have

381
00:15:30,746 --> 00:15:31,536
to care about, right?

382
00:15:31,536 --> 00:15:32,856
Why do I the human have to come

383
00:15:32,856 --> 00:15:34,186
up with these unique
identifiers?

384
00:15:34,396 --> 00:15:38,836
So, in short where we get
some nice flexibility of XML

385
00:15:38,836 --> 00:15:41,256
and it's very low-impacted,
we don't have to know anything

386
00:15:41,256 --> 00:15:43,206
about databases, we don't
need a database server.

387
00:15:43,206 --> 00:15:46,176
We can run this on a PC under
the desk at the pizza shop

388
00:15:46,176 --> 00:15:47,916
so long as it has a
public IP address.

389
00:15:48,106 --> 00:15:50,586
It's pretty simple, but
we're paying a prince for it

390
00:15:50,586 --> 00:15:53,666
and even now, I don't
think we've reached closure

391
00:15:53,786 --> 00:15:56,636
on what the best design is here,
but again, these are the kinds

392
00:15:56,636 --> 00:15:59,226
of things you should be
struggling with or tripping

393
00:15:59,226 --> 00:16:01,826
over so that at least when
you've designed your file even

394
00:16:01,826 --> 00:16:04,786
if you know this is not
perfect this world of XML,

395
00:16:04,926 --> 00:16:07,516
at least you've made a conscious
design choice and as we've said,

396
00:16:07,756 --> 00:16:12,176
as I think we've said earlier
any time you have given some

397
00:16:12,176 --> 00:16:15,326
thought to a design decision
like this in code or in XML

398
00:16:15,326 --> 00:16:19,196
or in HTML even, and you think
hum you know a sharp teaching

399
00:16:19,196 --> 00:16:21,396
fellow might think that I'm
an idiot for having done this

400
00:16:21,396 --> 00:16:24,046
but I really did give it some
thought just comment it, right?

401
00:16:24,046 --> 00:16:26,606
Put PHP comments, XML
comments, to just convey

402
00:16:26,606 --> 00:16:29,406
to us why you did what you
did and even if we disagree

403
00:16:29,406 --> 00:16:31,656
at least we know you
too are a rational being

404
00:16:31,656 --> 00:16:34,216
who gave this some thought,
so that's as important

405
00:16:34,216 --> 00:16:35,056
as the decision itself.

406
00:16:36,236 --> 00:16:38,056
Alright, so any questions
then on XML,

407
00:16:38,056 --> 00:16:41,226
XPath, pizza or the like?

408
00:16:41,706 --> 00:16:45,646
Alright, so let's
try to solve some

409
00:16:45,646 --> 00:16:48,666
of these problems
then today, databases

410
00:16:48,666 --> 00:16:49,856
and SEQUELs specifically.

411
00:16:49,906 --> 00:16:51,686
What is a database
in layman's terms?

412
00:16:51,686 --> 00:16:51,836
Axel.

413
00:16:51,836 --> 00:16:58,376
>> You can call it big
table with rows and columns.

414
00:16:58,376 --> 00:17:00,336
>> Good. So, big table
with rows and columns

415
00:17:00,336 --> 00:17:02,786
or at least that's one type of
database that's been popular

416
00:17:02,786 --> 00:17:04,186
for some time, there's
alternatives

417
00:17:04,186 --> 00:17:07,886
that we'll touch upon, but a
table with rows and columns.

418
00:17:07,886 --> 00:17:09,766
So, you might think
of just frankly Excel

419
00:17:09,766 --> 00:17:11,976
or Apple numbers any kind

420
00:17:11,976 --> 00:17:13,876
of spreadsheet program is
effectively giving you a

421
00:17:13,876 --> 00:17:16,196
database and in fact if
you've used Microsoft Access,

422
00:17:16,196 --> 00:17:19,396
I mean that even looks like a
spreadsheet at least the UI part

423
00:17:19,396 --> 00:17:22,526
of it and it allows you to
store data in rows and columns.

424
00:17:22,776 --> 00:17:25,416
So, what kinds of things
might you store in a database?

425
00:17:25,416 --> 00:17:28,906
Well, what about
orders from a pizzeria?

426
00:17:28,906 --> 00:17:31,846
Suppose we did go this route
of having a unique identifier

427
00:17:31,846 --> 00:17:34,646
for pizzas, pizza underscore
S underscore anchovies,

428
00:17:34,936 --> 00:17:38,616
or the location path, or
a combination of fields

429
00:17:38,616 --> 00:17:42,466
like Ben proposed "pizza",
and "S", and "anchovies",

430
00:17:42,466 --> 00:17:45,586
or the price and whatever your
approach, suppose that one

431
00:17:45,586 --> 00:17:48,586
of our columns represented
that, whatever we're using

432
00:17:48,586 --> 00:17:50,696
to identify the thing
in the shopping cart

433
00:17:51,116 --> 00:17:53,846
and in the other columns
what might we have?

434
00:17:54,086 --> 00:17:57,196
What are the other pieces of
information you probably need

435
00:17:57,196 --> 00:17:59,836
to remember when someone
is buying something

436
00:17:59,836 --> 00:18:01,736
from a pizza site
or any site really?

437
00:18:02,366 --> 00:18:02,456
Yeah.

438
00:18:03,476 --> 00:18:06,136
>> You would do the price, so
if you were in the database

439
00:18:06,136 --> 00:18:09,376
with an identifier you
could get that price.

440
00:18:09,376 --> 00:18:09,606
>> Okay.

441
00:18:09,606 --> 00:18:13,146
>> Maybe the name, maybe
availability if it's.

442
00:18:13,356 --> 00:18:16,186
>> Great. Yeah, any of this
metadata, again quantity

443
00:18:16,186 --> 00:18:20,576
of pizzas bought or available,
how many of these things are

444
00:18:20,576 --> 00:18:22,266
in stock, how much dough
do you have for instance,

445
00:18:22,386 --> 00:18:25,186
what is the unit price or
in the case of an order

446
00:18:25,186 --> 00:18:27,896
that user is placed how
many pizzas did they buy,

447
00:18:28,176 --> 00:18:30,106
so in short any of the
data you might remember

448
00:18:30,416 --> 00:18:32,276
that you're throwing
away right now

449
00:18:32,276 --> 00:18:33,816
for pizza ML cause
we don't require

450
00:18:33,816 --> 00:18:36,446
that you have a database, but we
do require that you allow these

451
00:18:36,446 --> 00:18:37,806
or to pretend to checkout.

452
00:18:38,036 --> 00:18:39,716
Once you have a database
you can also write

453
00:18:40,126 --> 00:18:42,436
to this storage mechanism
and not just read

454
00:18:42,436 --> 00:18:44,156
as you are from the XML file.

455
00:18:44,526 --> 00:18:46,306
Oh, and just as an
aside so it's clear,

456
00:18:46,646 --> 00:18:49,066
one of the key motivations
for XPath realize is

457
00:18:49,066 --> 00:18:51,346
that if you do take the
approach that Axel proposed

458
00:18:51,556 --> 00:18:57,136
or Ben proposed or this approach
of a unique identifier realize

459
00:18:57,136 --> 00:19:00,116
that if you do have at least
the first and third of those,

460
00:19:00,116 --> 00:19:02,186
if you have a location path

461
00:19:02,186 --> 00:19:06,386
or a unique identifier you can
use XPath as we did to home

462
00:19:06,386 --> 00:19:09,816
in on the specific element
in the XML file that you want

463
00:19:09,816 --> 00:19:11,806
to get back because you
want to check it's price,

464
00:19:11,806 --> 00:19:13,646
you want to check it's
availability and so forth,

465
00:19:13,896 --> 00:19:15,866
so realize that the
XPath function

466
00:19:15,896 --> 00:19:19,436
in PHP even though it will
return by definition an array

467
00:19:19,506 --> 00:19:21,616
of matching elements, realize

468
00:19:21,616 --> 00:19:24,096
that array could just be a
size one which it will be

469
00:19:24,096 --> 00:19:26,016
if you've uniquely
identified a node.

470
00:19:26,306 --> 00:19:29,336
So, you can just grab bracket
zero for instance of that array

471
00:19:29,336 --> 00:19:31,306
to get back the thing
that you queried for,

472
00:19:31,306 --> 00:19:33,686
so that's where XPath's
power comes from so

473
00:19:33,686 --> 00:19:36,676
that you the developer, don't
have to write a bunch of nested

474
00:19:36,736 --> 00:19:39,786
for loops iterating over every
possible element in the tree,

475
00:19:39,786 --> 00:19:41,866
looking is this one, is this
the one, is this the one,

476
00:19:41,866 --> 00:19:44,946
is the one with Ifs and ELSEs
and so forth, you can do it

477
00:19:44,946 --> 00:19:48,346
with XPath, so realize you have
that expressive capability.

478
00:19:48,946 --> 00:19:52,336
Alright, so what forms
do databases come in?

479
00:19:52,336 --> 00:19:56,066
Well, what's CSV?

480
00:19:56,276 --> 00:19:57,276
What's it stand for even?

481
00:19:57,806 --> 00:19:58,726
Yeah, Ben.

482
00:19:58,836 --> 00:19:59,976
>> Comma-separated values.

483
00:20:00,076 --> 00:20:01,956
>> Yeah, comma-separated values.

484
00:20:02,216 --> 00:20:05,426
So, this is really like a
quick and dirty style database,

485
00:20:05,426 --> 00:20:09,046
so much like a spreadsheet has
rows and columns, you can kind

486
00:20:09,046 --> 00:20:10,816
of mimic the idea
of rows and columns

487
00:20:10,906 --> 00:20:14,156
in a text file whereby every
time you hit enter you get a new

488
00:20:14,156 --> 00:20:15,746
row, that's pretty trivial

489
00:20:15,946 --> 00:20:19,456
and every time you put a comma
you get essentially a different

490
00:20:19,506 --> 00:20:23,526
column, so if your first
row has three commas

491
00:20:23,836 --> 00:20:28,466
and therefore four fields,
so field, field, field,

492
00:20:28,576 --> 00:20:31,746
field and the next row
also has three commas

493
00:20:31,746 --> 00:20:34,936
and the next one has three
commas even though the spacing

494
00:20:34,936 --> 00:20:38,236
might not line up perfectly
you will have effectively four

495
00:20:38,236 --> 00:20:39,976
columns in the CSV file.

496
00:20:39,976 --> 00:20:40,986
So, you have the beginnings

497
00:20:40,986 --> 00:20:43,196
of a database not
unlike a spreadsheet,

498
00:20:43,636 --> 00:20:44,916
but there's some problems.

499
00:20:45,646 --> 00:20:47,826
If you're using commas
to separate your

500
00:20:47,876 --> 00:20:52,016
"columns" what's perhaps
the most glaring deficiency

501
00:20:52,016 --> 00:20:52,976
of this representation?

502
00:20:53,576 --> 00:20:54,976
Yeah, Conner.

503
00:20:55,016 --> 00:20:56,346
[ Inaudible Response ]

504
00:20:56,346 --> 00:20:59,556
Right. You're kind of screwed is
your words have commas in them

505
00:20:59,556 --> 00:21:01,276
because now it's
going to be ambiguous

506
00:21:01,506 --> 00:21:03,426
to the program reading this CSV

507
00:21:03,426 --> 00:21:06,836
or even the human reading
this CSV file what the comma

508
00:21:06,836 --> 00:21:08,816
represents, is it
a comma for grammar

509
00:21:08,816 --> 00:21:11,086
or is it a comma for
field separation?

510
00:21:11,306 --> 00:21:12,436
So, it wasn't so long ago

511
00:21:12,436 --> 00:21:14,286
that people found a
workaround for this.

512
00:21:14,286 --> 00:21:14,836
They have PSVs.

513
00:21:14,836 --> 00:21:16,956
Anyone know what a PSV is?

514
00:21:17,466 --> 00:21:18,766
>> Period-separation.

515
00:21:18,896 --> 00:21:19,886
>> Period, so not a bad guess.

516
00:21:19,886 --> 00:21:22,626
It's not period but it's
a pipe-separated values.

517
00:21:22,626 --> 00:21:25,316
It's that vertical bar that you
get by holding shift usually

518
00:21:25,316 --> 00:21:26,626
and hitting a key
on the keyboard.

519
00:21:26,856 --> 00:21:28,746
So, why did the world
introduce PSVs?

520
00:21:28,746 --> 00:21:30,646
Well, frankly to
address this problem.

521
00:21:30,826 --> 00:21:32,866
Now, of course there's
a corner case here,

522
00:21:32,866 --> 00:21:33,886
what's the problem with PSVs?

523
00:21:33,886 --> 00:21:34,006
Yeah.

524
00:21:34,006 --> 00:21:36,806
>> If anyone ever
needs to use one those.

525
00:21:36,936 --> 00:21:37,476
>> Exactly.

526
00:21:37,476 --> 00:21:40,226
If you ever need a vertical
bar which is less common

527
00:21:40,226 --> 00:21:42,616
in fairness, but possible
especially if it's right there

528
00:21:42,616 --> 00:21:44,326
on the keyboard you're
screwed again.

529
00:21:44,456 --> 00:21:46,716
Okay, so then the world
came up with TSVs.

530
00:21:46,716 --> 00:21:47,826
What are TSVs?

531
00:21:49,116 --> 00:21:49,266
>> Tilde.

532
00:21:49,436 --> 00:21:51,646
>> Tilde, no but good guess,
I mean frankly we can make

533
00:21:51,646 --> 00:21:53,046
up our own file formats
here pretty easily.

534
00:21:53,276 --> 00:21:58,126
It's tab-separated values,
so that is good in that

535
00:21:58,126 --> 00:22:00,546
at least you're not likely
to have a tab in the middle

536
00:22:00,546 --> 00:22:01,756
of the sentence, but what

537
00:22:01,756 --> 00:22:03,666
if you're actually
storing paragraphs or what

538
00:22:03,666 --> 00:22:06,646
if you actually are using
text editor as some of you

539
00:22:06,646 --> 00:22:07,656
for programming might use

540
00:22:07,826 --> 00:22:10,036
that doesn't really
support tabs per se cause

541
00:22:10,036 --> 00:22:13,066
if they automatically converted
to spaces, that system kind

542
00:22:13,066 --> 00:22:14,176
of breaks down, so tabs

543
00:22:14,176 --> 00:22:17,146
in general is a very fragile
mechanism in programming

544
00:22:17,146 --> 00:22:19,566
in general, so that's
not so hot either,

545
00:22:19,566 --> 00:22:20,876
so there are all
these various formats.

546
00:22:21,096 --> 00:22:23,556
There are some workarounds, what
you can actually do with any

547
00:22:23,556 --> 00:22:25,126
of these formats commas, pipes,

548
00:22:25,176 --> 00:22:28,576
tabs is you simply add some
quotes to the situation

549
00:22:28,836 --> 00:22:31,826
so you quote the entire
columns value so even

550
00:22:31,826 --> 00:22:35,046
if that value has a comma in
it for grammatical purposes,

551
00:22:35,336 --> 00:22:37,926
you look to the quotes
to say what is the string

552
00:22:37,926 --> 00:22:40,206
in this column so to speak.

553
00:22:40,396 --> 00:22:41,946
Now, of course what's
the pushback there?

554
00:22:41,946 --> 00:22:44,056
Well, what if you want to
have a quote in your quote,

555
00:22:44,236 --> 00:22:46,786
well then you have to
introduce backslash quote marks

556
00:22:46,786 --> 00:22:48,696
so we a solution there,
but what if you want

557
00:22:48,696 --> 00:22:50,776
to have backslash quote
marks literally in your,

558
00:22:50,776 --> 00:22:52,766
well then you do
backslash, backslash quote,

559
00:22:53,326 --> 00:22:55,926
but that might be ridiculous but
that's the world we're already

560
00:22:56,056 --> 00:22:57,196
in with programming itself,

561
00:22:57,406 --> 00:22:59,466
so just realize there
are these various issues.

562
00:22:59,846 --> 00:23:02,686
But more importantly, what's
nice about CSVs is one,

563
00:23:03,116 --> 00:23:04,566
they're completely
language independent.

564
00:23:04,566 --> 00:23:07,386
It doesn't matter if you're
Java programmer, PHP programmer,

565
00:23:07,386 --> 00:23:09,226
or whatever these are compatible

566
00:23:09,226 --> 00:23:12,086
with any language cause
it's just silly text.

567
00:23:12,556 --> 00:23:15,676
You have to write a program that
reads it, but once you do anyone

568
00:23:15,676 --> 00:23:18,266
in the world in theory could
use that library and voila,

569
00:23:18,266 --> 00:23:21,846
you now have CSV support indeed
in PHP they wrote it for us.

570
00:23:21,846 --> 00:23:25,556
There's a couple of functions,
F get CSV and F put CSV

571
00:23:25,636 --> 00:23:28,276
that does all of the
annoying ParSet for you,

572
00:23:28,486 --> 00:23:31,786
so with F get CSV if you
look up its documentation

573
00:23:31,786 --> 00:23:34,696
on PHP.net you'll see
that it takes an argument

574
00:23:34,696 --> 00:23:37,906
which for instance is
a reference to a file,

575
00:23:38,536 --> 00:23:42,396
food.csv on the hard drive
and it will ParSet for you top

576
00:23:42,396 --> 00:23:44,566
to bottom, left to right
and what it will return

577
00:23:44,696 --> 00:23:49,786
to you is an array of all of
the rows and then you can index

578
00:23:49,786 --> 00:23:52,626
into those rows because
they themselves are arrays.

579
00:23:52,956 --> 00:23:55,066
So, it returns an
array of arrays

580
00:23:55,486 --> 00:23:57,186
where the big array
represents all your rows

581
00:23:57,186 --> 00:24:00,516
and each individual array
represents all of the cells

582
00:24:00,516 --> 00:24:02,706
or the columns in that
row, so it's nice.

583
00:24:02,736 --> 00:24:04,986
It just deals with all the
stupid space issues, the pipes,

584
00:24:05,066 --> 00:24:06,906
the CS, the commas, the tabs,

585
00:24:07,156 --> 00:24:09,236
and in fact even though
it's called F get CSV,

586
00:24:09,236 --> 00:24:12,036
you can override the
default delimiter

587
00:24:12,156 --> 00:24:16,176
and tell it don't use comma,
use tab, use pipe, use tilde,

588
00:24:16,176 --> 00:24:19,036
use period, whatever you
want it to be, so it's nice

589
00:24:19,036 --> 00:24:21,566
in that regard and more
importantly if there's F put CSV

590
00:24:21,566 --> 00:24:23,976
which is compelling cause
it does the opposite,

591
00:24:23,976 --> 00:24:27,706
it writes CSV files and it deals
with the headache of figuring

592
00:24:27,706 --> 00:24:28,776
out what needs to be quoted,

593
00:24:28,776 --> 00:24:30,926
what needs to be
backslashed escaped

594
00:24:31,196 --> 00:24:32,576
and so forth, so really handy.

595
00:24:33,256 --> 00:24:34,686
Alright. Then there's XML,

596
00:24:34,926 --> 00:24:38,866
SimpleXML API actually has some
faults it's sometimes too simple

597
00:24:39,116 --> 00:24:41,666
and for instance case in point
the X file function always

598
00:24:41,666 --> 00:24:44,046
returns an array even if
you know there's only going

599
00:24:44,046 --> 00:24:47,136
to be one node in the tree that
has this unique identifier,

600
00:24:47,416 --> 00:24:49,966
so occasionally you'll run into
just some, hum it's not quite

601
00:24:49,966 --> 00:24:52,606
as user friendly as you like,
but it's definitely simple.

602
00:24:52,606 --> 00:24:56,716
An alternative to it is the
DOM, document object model API

603
00:24:56,856 --> 00:24:59,876
and PHP which you're welcome
to use if you really want.

604
00:24:59,876 --> 00:25:01,186
It's a little more sophisticated

605
00:25:01,186 --> 00:25:04,346
and it's not necessary
technologically for pizza ML,

606
00:25:04,506 --> 00:25:06,776
but realize there are
other parsers so to speak.

607
00:25:06,776 --> 00:25:09,706
A parser is a program that
reads files and does something

608
00:25:10,006 --> 00:25:12,356
with them, and then
we have MySQL.

609
00:25:12,886 --> 00:25:13,796
What is MySQL?

610
00:25:14,536 --> 00:25:14,616
Yeah.

611
00:25:17,216 --> 00:25:18,396
>> A query language.

612
00:25:18,926 --> 00:25:21,786
>> It's actually not
a query language,

613
00:25:21,786 --> 00:25:24,276
SEQUEL is a query
language, but what's MySQL?

614
00:25:24,556 --> 00:25:25,856
>> That's the type of database.

615
00:25:26,166 --> 00:25:28,646
>> Yeah, so it's a specific
make and model of database,

616
00:25:28,646 --> 00:25:31,516
so it's a vendor
called MySQL that sells

617
00:25:31,516 --> 00:25:32,456
and also makes available

618
00:25:32,456 --> 00:25:36,766
for free a database server
called MySQL, so it's software

619
00:25:36,766 --> 00:25:39,726
that you can download from Mac
OS, Windows, Linux and the like.

620
00:25:40,086 --> 00:25:43,646
You install it on your physical
computer or your physical server

621
00:25:43,866 --> 00:25:48,606
and it is just another service
running on a port listening

622
00:25:48,606 --> 00:25:52,896
in this case to TCP port 3306,
sort of slightly random trivia

623
00:25:52,896 --> 00:25:55,536
that you only rarely need
to know, but it's just

624
00:25:55,586 --> 00:25:57,996
in that regard like a web
server like an email server,

625
00:25:57,996 --> 00:26:00,786
it's listening on a port like
we discussed in lecture zero.

626
00:26:01,126 --> 00:26:05,596
So, MySQL, MySQL PDO we'll
talk about at least a couple

627
00:26:05,596 --> 00:26:07,896
of these tonight and on
Monday but there are,

628
00:26:07,896 --> 00:26:11,346
there's built-in support in
PHP from MySQL which means just

629
00:26:11,346 --> 00:26:13,996
like you have these built-in
functions for CSVs, and TSVs,

630
00:26:13,996 --> 00:26:18,606
and PSVs, and XML, similarly do
get a lot of free functionality

631
00:26:18,606 --> 00:26:22,506
with PHP itself for
interacting with a database

632
00:26:22,506 --> 00:26:24,246
and what is this database MySQL?

633
00:26:24,446 --> 00:26:27,326
It's essentially a super fancy
version of rows and columns

634
00:26:27,326 --> 00:26:29,646
but with multiple tables,
multiple spreadsheets

635
00:26:29,646 --> 00:26:31,736
if you will so you store
lots and lots of data

636
00:26:31,736 --> 00:26:33,466
as we'll start to tonight.

637
00:26:34,556 --> 00:26:38,736
So, SQLite anyone
know what this is?

638
00:26:38,856 --> 00:26:41,016
It's yet another
alternative for a database.

639
00:26:41,506 --> 00:26:43,346
So, SQLite allows you

640
00:26:43,346 --> 00:26:45,696
to use SEQUEL the structured
query language we'll start

641
00:26:45,696 --> 00:26:48,136
looking at tonight in
lecture and in section

642
00:26:48,526 --> 00:26:51,646
and it lets you use
the language,

643
00:26:51,756 --> 00:26:54,456
but without needing a
database server per se,

644
00:26:54,456 --> 00:26:58,056
a SQLite database is
literally just a binary file,

645
00:26:58,056 --> 00:27:01,306
zeros and ones on disc
usually called something dot db

646
00:27:01,376 --> 00:27:05,146
for database and it stores
rows and columns; how?

647
00:27:05,146 --> 00:27:06,906
You don't have to care
about or know about

648
00:27:07,126 --> 00:27:08,626
but it creates the
illusion of rows

649
00:27:08,626 --> 00:27:11,396
and columns even though it
itself is just one big file,

650
00:27:11,716 --> 00:27:13,416
so it allows you to use SEQUEL

651
00:27:13,416 --> 00:27:15,186
as we'll see it's quite
powerful, flexible,

652
00:27:15,186 --> 00:27:18,006
way more user, well its
sort of user friendly,

653
00:27:18,006 --> 00:27:21,836
way more expressive than CSVs
and even XML files especially

654
00:27:21,836 --> 00:27:24,616
when it comes to searching
and such, but you don't need

655
00:27:24,616 --> 00:27:27,116
to figure out how to turn on
a database or run a database

656
00:27:27,116 --> 00:27:29,216
or use the RAM for a
database if it's a quick

657
00:27:29,216 --> 00:27:33,306
and dirty application or a small
application with tens of users;

658
00:27:33,716 --> 00:27:36,486
this means you can package
it all up in one folder

659
00:27:36,486 --> 00:27:38,846
and you don't need any
infrastructure beyond that,

660
00:27:38,846 --> 00:27:40,966
so some of the projects
that I've ran, I just reach

661
00:27:40,966 --> 00:27:42,706
for SQLite where I want
to use SEQUEL cause

662
00:27:42,706 --> 00:27:44,366
as you'll see it's a
very powerful language

663
00:27:44,596 --> 00:27:47,166
but I don't want to setup a
database, setup a user name

664
00:27:47,166 --> 00:27:50,036
and a password when I give the
code someone else then they have

665
00:27:50,036 --> 00:27:52,166
to setup a database,
import that the database,

666
00:27:52,366 --> 00:27:54,576
this way you literally can send
them a Zip file with everything

667
00:27:54,576 --> 00:27:58,316
in it so long as their server
supports whatever language the

668
00:27:58,316 --> 00:28:01,676
project was written in whether
its PHP or something else.

669
00:28:01,956 --> 00:28:07,556
Alright. So, rows looks
familiar, here's Microsoft Excel

670
00:28:07,556 --> 00:28:09,186
and just to paint a
very concrete picture,

671
00:28:09,236 --> 00:28:11,356
rows and columns we're sort
of talking about building

672
00:28:11,356 --> 00:28:12,836
with something like
that in memory

673
00:28:13,046 --> 00:28:15,556
and we can do it the old school
way using a terminal window,

674
00:28:15,556 --> 00:28:18,636
so you might have dived in to
Project Zero already with SSH

675
00:28:18,636 --> 00:28:23,326
or with opening a terminal
window inside of the appliance.

676
00:28:23,326 --> 00:28:25,556
This gives you a black
and white or white

677
00:28:25,556 --> 00:28:28,546
and black interface not
unlike this and MySQL

678
00:28:28,546 --> 00:28:31,476
because it's a server
that's listening on a port,

679
00:28:31,476 --> 00:28:34,576
you can talk to it just like a
web browser can talk to a server

680
00:28:34,576 --> 00:28:36,366
or an email client
can talk to a server,

681
00:28:36,646 --> 00:28:38,816
so one way in which you
can start interacting

682
00:28:38,816 --> 00:28:41,396
with MySQL is via
this command prompt.

683
00:28:41,396 --> 00:28:43,506
So, let me actually go
over to the appliance here,

684
00:28:43,836 --> 00:28:46,446
let me open up a terminal
window, and what I'm going

685
00:28:46,446 --> 00:28:51,576
to do is, what I'm going
to do here is type MySQL,

686
00:28:51,886 --> 00:28:54,176
enter and we'll see
access denied

687
00:28:54,176 --> 00:28:56,986
and let me make one tweak
here just so that we can zoom

688
00:28:58,706 --> 00:29:08,206
in on this, give me one
second, so I can zoom in for us.

689
00:29:11,696 --> 00:29:12,366
There we go.

690
00:29:12,836 --> 00:29:16,036
Okay, so access denied for
user jharvard@localhost.

691
00:29:16,036 --> 00:29:18,286
Recall that localhost refers
to the current computer.

692
00:29:18,496 --> 00:29:21,446
So I need to type a slightly
more involved command.

693
00:29:21,446 --> 00:29:24,906
MySQL-ujharvard is
fine and then dash p,

694
00:29:25,006 --> 00:29:27,506
enter now I'm being asked for a
password, as you may have seen

695
00:29:27,506 --> 00:29:29,096
in the appliances documentation,

696
00:29:29,296 --> 00:29:32,156
is that John Harvard's password
is always crimson by default

697
00:29:32,286 --> 00:29:35,516
so now we're at the point
of where the slide was

698
00:29:35,516 --> 00:29:38,446
at just a moment ago, so now
I can start typing commands

699
00:29:38,446 --> 00:29:40,216
and this will very
quickly get a bit tedious,

700
00:29:40,216 --> 00:29:45,006
but I can type show databases;
and here the databases that come

701
00:29:45,006 --> 00:29:48,826
by default with a MySQL server,
information schema, MySQL,

702
00:29:48,826 --> 00:29:51,276
performance schema, test; for
the most part you should never,

703
00:29:51,276 --> 00:29:53,646
ever touch these
databases in fact,

704
00:29:53,896 --> 00:29:55,896
we typically can figure
the appliance in such a way

705
00:29:56,156 --> 00:29:58,586
that when you log into the
database server with a GUI

706
00:29:58,586 --> 00:30:01,816
as we'll soon see called
PHPMyadmin we hide these

707
00:30:01,816 --> 00:30:05,066
by default cause generally a
developer should not touch them,

708
00:30:05,066 --> 00:30:06,966
they're used internally by MySQL
with a couple of exceptions.

709
00:30:07,046 --> 00:30:11,366
So, there's nothing
interesting here just yet,

710
00:30:11,586 --> 00:30:13,466
but I could start
typing commands,

711
00:30:13,726 --> 00:30:16,776
like I could start saying if
I need a database for lecture,

712
00:30:16,996 --> 00:30:21,576
I could say create database and
then specify what kinds of rows

713
00:30:21,576 --> 00:30:22,686
and columns do I want.

714
00:30:22,686 --> 00:30:25,996
Do I want numbers, do I want
letters, do I want dates,

715
00:30:25,996 --> 00:30:28,296
do I want times, we're
going to able to specify

716
00:30:28,526 --> 00:30:31,426
with much more precision that
type of data we're storing

717
00:30:31,586 --> 00:30:36,196
where a CSV and XML they're all
strings in that case unless it,

718
00:30:36,196 --> 00:30:37,646
and even though it
might look like int,

719
00:30:37,646 --> 00:30:39,936
it's still "an integer."

720
00:30:40,306 --> 00:30:42,736
So, let's actually
transition from this black

721
00:30:42,736 --> 00:30:43,726
and white environment

722
00:30:43,726 --> 00:30:46,206
which while powerful is
just not very user friendly

723
00:30:46,506 --> 00:30:49,446
and let me open up
something called PHPMyadmin.

724
00:30:49,736 --> 00:30:52,166
So, notice here, I'm back in
my Mac but you can do this

725
00:30:52,166 --> 00:30:54,786
with any operating system,
Windows or Linux or the like,

726
00:30:55,136 --> 00:30:58,716
notice I've gone to http:// even
though the browser is hiding

727
00:30:58,716 --> 00:31:00,836
that, appliance/PHPMyadmin.

728
00:31:01,636 --> 00:31:04,206
Now, this will not work
for you out of the box,

729
00:31:04,246 --> 00:31:06,496
when you've booted up the
appliance you can pull

730
00:31:06,496 --> 00:31:10,516
up this utility called
PHPMyadmin using Chrome inside

731
00:31:10,516 --> 00:31:14,516
of the appliance
by using what URL?

732
00:31:15,176 --> 00:31:18,846
If I open up Chrome inside
of the CSV 50 appliance,

733
00:31:19,656 --> 00:31:24,896
and I type http:// it's kind
spoiled by autocomplete now.

734
00:31:25,856 --> 00:31:26,866
Where can I go to see this?

735
00:31:27,986 --> 00:31:28,266
>> Localhost.

736
00:31:28,266 --> 00:31:28,946
>> Localhost.

737
00:31:28,946 --> 00:31:31,316
When you're inside of
the computer as I am now,

738
00:31:31,316 --> 00:31:33,486
I'm in Linux cause that's
the screen I've pulled up,

739
00:31:33,486 --> 00:31:36,346
I can call up localhost and
I'll get to the same place.

740
00:31:36,346 --> 00:31:38,886
I'm going to type in jharvard
and crimson for my user name

741
00:31:38,886 --> 00:31:42,116
and password and I
see that same UI here.

742
00:31:42,116 --> 00:31:44,016
The browser looks ever so
slightly different in Linux

743
00:31:44,016 --> 00:31:45,486
than in Mac OS, but
it's the same idea.

744
00:31:45,896 --> 00:31:50,266
However, on my Mac I have
instead appliance/localhost.

745
00:31:50,566 --> 00:31:52,746
How did I make that possible?

746
00:31:54,226 --> 00:31:54,896
Yeah, Axel.

747
00:31:55,126 --> 00:31:57,016
>> Probably at c host.

748
00:31:57,016 --> 00:32:00,476
>> Yeah, at c host, so recall
that there's this trick

749
00:32:00,476 --> 00:32:03,506
on Mac OS and Linux and also
on Windows where there's a file

750
00:32:03,796 --> 00:32:07,356
on the system and the project
spec tells you how you can edit

751
00:32:07,356 --> 00:32:09,036
this for Project
Zero if you want.

752
00:32:09,476 --> 00:32:13,076
That allows you to specify
a synonym for an IP address.

753
00:32:13,556 --> 00:32:15,646
So, what I have done
is edited at c host,

754
00:32:15,746 --> 00:32:20,046
to say that this IP address
w.x.y.z should actually be known

755
00:32:20,046 --> 00:32:21,446
as appliance; why?

756
00:32:21,736 --> 00:32:22,736
Just more user friendly.

757
00:32:23,126 --> 00:32:26,806
It would be wrong for me
type localhost/PHPMadmin

758
00:32:26,806 --> 00:32:28,066
on my Mac why?

759
00:32:28,646 --> 00:32:28,736
Yeah.

760
00:32:29,696 --> 00:32:31,586
>> Because that would
refer back to your Mac

761
00:32:33,716 --> 00:32:33,906
which you [inaudible].

762
00:32:34,026 --> 00:32:37,226
>> Exactly, localhost if I'm
on my Mac refers obviously

763
00:32:37,226 --> 00:32:39,916
to my Mac cause that
is indeed the localhost

764
00:32:40,296 --> 00:32:43,466
and there is no PHPMyadmin on
there, at least by default.

765
00:32:43,466 --> 00:32:45,376
You can install it, you
can use XAMPP or WAMP

766
00:32:45,376 --> 00:32:47,946
or these various tools we talked
about briefly in the first week

767
00:32:47,946 --> 00:32:49,916
as alternatives, but if
you're not running it

768
00:32:49,916 --> 00:32:52,296
on your actual Mac or PC
that's going to be a dead end,

769
00:32:52,296 --> 00:32:55,116
so instead we want to resolve
it to the IP address and as

770
00:32:55,116 --> 00:32:57,546
for the appliance it's
always going to change

771
00:32:57,976 --> 00:33:00,566
but you can always check
the appliances IP address

772
00:33:00,886 --> 00:33:02,696
by checking the bottom
right-hand corner here

773
00:33:02,696 --> 00:33:04,966
and if you indeed have
network access on your computer

774
00:33:04,966 --> 00:33:07,526
and all is well and nothing's
broken, you will be able

775
00:33:07,526 --> 00:33:10,036
to visit the appliance
via that IP address

776
00:33:10,036 --> 00:33:13,786
which in this case
is 17216100.129,

777
00:33:13,786 --> 00:33:15,426
but will be different for you.

778
00:33:16,416 --> 00:33:21,386
Alright, so what can we do
when we're in this environment?

779
00:33:21,386 --> 00:33:23,576
Well, let me go back to my
Mac and why do I use my Mac?

780
00:33:23,576 --> 00:33:26,146
It's just a little either,
they're functionally equivalent.

781
00:33:26,576 --> 00:33:27,786
So, what can we do here?

782
00:33:27,786 --> 00:33:30,496
Well, one now I'm being
told there's no databases,

783
00:33:30,616 --> 00:33:32,846
but that's because the
appliance is lying to you.

784
00:33:32,846 --> 00:33:35,576
We've hidden some of those more
administrative type databases

785
00:33:35,576 --> 00:33:38,186
just so you don't accidentally
break things or get distracted

786
00:33:38,346 --> 00:33:40,706
by things you don't care about
and I'm going to go ahead

787
00:33:40,706 --> 00:33:42,506
and click databases at top left

788
00:33:43,096 --> 00:33:45,346
and you'll see I'm prompted
to create a database.

789
00:33:45,346 --> 00:33:48,506
I'm going to go ahead and create
a database called jharvard

790
00:33:48,506 --> 00:33:51,306
underscore lecture and notice

791
00:33:51,356 --> 00:33:54,076
that I'm choosing my user name
underscore and then the name

792
00:33:54,076 --> 00:33:55,746
of the database that
I actually want

793
00:33:55,746 --> 00:33:58,656
which is a common convention
and I'm going click create

794
00:33:59,006 --> 00:34:02,036
and now notice what
PHPMyadmin did.

795
00:34:02,036 --> 00:34:05,026
PHPMyadmin is coincidentally
written in PHP.

796
00:34:05,156 --> 00:34:06,426
We don't care that it's written

797
00:34:06,426 --> 00:34:09,186
in PHP that's just what the
person who wrote it called it.

798
00:34:09,186 --> 00:34:12,736
It, however, is a GUI graphical
user interface that allows you

799
00:34:12,736 --> 00:34:14,716
to administer a MySQL server

800
00:34:14,946 --> 00:34:16,526
that could be running
really anywhere,

801
00:34:16,526 --> 00:34:18,086
in this case it's
running on my appliance,

802
00:34:18,506 --> 00:34:20,936
but if you have a commercial
web post they might say,

803
00:34:20,936 --> 00:34:25,886
hey we support PHPMyadmin go
to something .com/PHPMYadmin

804
00:34:25,886 --> 00:34:27,896
and then you log in with
your commercial user name

805
00:34:27,896 --> 00:34:28,936
and password in that case.

806
00:34:29,416 --> 00:34:31,646
So, you'll see hopefully
that this is just a nice way

807
00:34:31,646 --> 00:34:33,806
of navigating a database
especially once you start

808
00:34:33,806 --> 00:34:36,066
getting data in it cause you
can just see it visually rather

809
00:34:36,066 --> 00:34:39,006
than in black and white text and
also pedagogically we'll be able

810
00:34:39,006 --> 00:34:42,406
to see the features of MySQL
thanks some nice dropdowns

811
00:34:42,406 --> 00:34:43,116
in just a moment.

812
00:34:43,476 --> 00:34:44,856
So, this text here in purple

813
00:34:44,856 --> 00:34:48,136
and green create
database/jharvard

814
00:34:48,136 --> 00:34:52,536
underscorelecture/;
that is sequel code.

815
00:34:52,926 --> 00:34:57,156
So, if I instead go back to
my terminal window, my black

816
00:34:57,156 --> 00:34:58,596
and white prompt from before,

817
00:34:58,956 --> 00:35:01,326
I could have finished this
sentence and I could have said,

818
00:35:01,326 --> 00:35:05,476
create database jharvard
underscorelecture`;

819
00:35:06,216 --> 00:35:09,886
and notice it fails, why?

820
00:35:09,996 --> 00:35:12,496
>> Because it says it exists.

821
00:35:12,496 --> 00:35:13,296
>> I already exists, right?

822
00:35:13,296 --> 00:35:16,546
And if I change the name
like lecture 2, enter.

823
00:35:16,916 --> 00:35:17,896
That seemed to be okay

824
00:35:17,896 --> 00:35:21,316
and notice the feedback is
literally query okay one row

825
00:35:21,316 --> 00:35:24,836
affect and it did it really fast
zero seconds now let's go back

826
00:35:24,836 --> 00:35:27,896
to PHPMyadmin, let's go
back to the home icon here

827
00:35:27,896 --> 00:35:29,776
and voila now I have
two databases.

828
00:35:30,036 --> 00:35:32,976
If I want to delete it I can
go ahead and click here and,

829
00:35:32,976 --> 00:35:34,506
or actually let me go here.

830
00:35:34,706 --> 00:35:39,086
If I go to the databases tab
noice I see my two databases,

831
00:35:39,086 --> 00:35:42,696
I can check lecture 2, click
drop which is the lexicon

832
00:35:42,696 --> 00:35:45,826
for removing a database,
notice it's yelling at me

833
00:35:45,826 --> 00:35:47,996
"are you sure you want
to execute this command?"

834
00:35:48,466 --> 00:35:50,866
And I am, and in fact I'm going
to do it the other way though,

835
00:35:50,866 --> 00:35:57,316
I'm going to go here and say
drop database jharvard lecture

836
00:35:57,316 --> 00:36:03,866
2; query okay, back here,
reload and click the home icon

837
00:36:04,266 --> 00:36:06,156
and sometimes it caches
so you have to reload.

838
00:36:06,156 --> 00:36:07,556
Oh, did I delete the wrong one?

839
00:36:08,976 --> 00:36:10,006
I did, that's okay.

840
00:36:10,236 --> 00:36:11,576
Here's how you rename databases.

841
00:36:12,016 --> 00:36:15,176
So, now I'm going to
go up to lecture 2,

842
00:36:15,456 --> 00:36:17,806
now notice there's a bunch of
tabs and we won't play with all

843
00:36:17,806 --> 00:36:19,676
of them today and most,
for the most part they're

844
00:36:19,676 --> 00:36:21,806
self-explanatory once you
get comfortable exploring.

845
00:36:21,806 --> 00:36:23,226
I'm going to click operations

846
00:36:23,486 --> 00:36:26,646
and now notice I can just change
this here jharvard underscore

847
00:36:26,646 --> 00:36:29,496
lecture and there's some
other stuff I can drop it,

848
00:36:29,686 --> 00:36:32,496
I can create a table in
it, I can copy the database

849
00:36:32,496 --> 00:36:36,056
and so forth, but let me
click rename and now notice,

850
00:36:36,536 --> 00:36:39,016
and this where it's
pedagogically instructive even

851
00:36:39,016 --> 00:36:42,406
though we're using this click,
user friendly click and drag

852
00:36:42,406 --> 00:36:45,286
and interface, notice that
it's telling us the SEQUEL

853
00:36:45,446 --> 00:36:47,246
that you could have
executed manually

854
00:36:47,246 --> 00:36:49,116
so you can infer what
the syntax is like,

855
00:36:49,416 --> 00:36:51,446
create database jharvard
lecture,

856
00:36:51,636 --> 00:36:55,766
drop database jharvard lecture 2
is the way it chose to rename it

857
00:36:55,816 --> 00:36:57,976
in this case, but it would
have preserved our data

858
00:36:58,076 --> 00:36:58,706
if we had any.

859
00:36:58,706 --> 00:36:59,416
Yeah, Conner.

860
00:36:59,416 --> 00:37:02,236
>> I noticed all these are,
all these commands are in caps,

861
00:37:02,236 --> 00:37:04,916
is that case sensitive?

862
00:37:04,916 --> 00:37:06,686
>> It is not, so even
though you're seeing them

863
00:37:06,686 --> 00:37:09,516
in purple capital letters,
that's typically a convention

864
00:37:09,516 --> 00:37:11,866
and it's a style thing that I
would actually encourage you

865
00:37:11,866 --> 00:37:13,966
to adopt because when
you're reading you code

866
00:37:13,966 --> 00:37:18,146
in a syntax highlighted
terminal IDE or code editor,

867
00:37:18,646 --> 00:37:20,376
or when you're just
reading it as a human

868
00:37:20,376 --> 00:37:23,586
with eyes it's just easier
to see the SEQUEL keywords

869
00:37:23,586 --> 00:37:26,816
as distinct from your own
table and column names

870
00:37:27,266 --> 00:37:29,166
which are generally
should be in lowercase

871
00:37:29,166 --> 00:37:31,006
with no spaces or
weird punctuation.

872
00:37:31,846 --> 00:37:32,446
Good question.

873
00:37:33,066 --> 00:37:34,946
Alright, so let's now use this,

874
00:37:34,946 --> 00:37:37,066
but let's first motivate
the problem somehow.

875
00:37:37,066 --> 00:37:41,516
I want to create a
database for authentication.

876
00:37:41,516 --> 00:37:43,956
Recall that on Monday
we looked at a number

877
00:37:43,956 --> 00:37:48,846
of PHP examples among which,
actually last week we looked

878
00:37:48,846 --> 00:37:52,806
at the log in examples where
we used session and a cookie

879
00:37:53,056 --> 00:37:56,916
and we remembered that jharvard
was logged in or not logged in

880
00:37:56,916 --> 00:37:59,216
and we went through a few
iterations log in 1, log in 2,

881
00:37:59,216 --> 00:38:01,026
log in 3 and so forth and none

882
00:38:01,026 --> 00:38:02,646
of those though used
the database,

883
00:38:02,646 --> 00:38:04,446
at best we used a constant

884
00:38:04,846 --> 00:38:06,766
and just hardcoded John
Harvard's user name

885
00:38:06,766 --> 00:38:09,066
and password, not very
scalable, not very conducive

886
00:38:09,066 --> 00:38:10,116
to having multiple users.

887
00:38:10,406 --> 00:38:11,456
So, now we have a database.

888
00:38:11,816 --> 00:38:14,026
So, now I have the ability
to store rows and columns

889
00:38:14,026 --> 00:38:15,856
of information related to users.

890
00:38:16,336 --> 00:38:18,776
So, that begs the question,
if you were storing a database

891
00:38:18,776 --> 00:38:20,636
of users even if you
just think of this now

892
00:38:20,636 --> 00:38:23,736
as an Excel spreadsheet,
what would the columns be

893
00:38:23,736 --> 00:38:26,596
in that spreadsheet that you
would want to remember for each

894
00:38:26,596 --> 00:38:29,046
of your users which is
going to represent a row,

895
00:38:29,046 --> 00:38:32,466
so users go in rows, but
what each column represent

896
00:38:32,466 --> 00:38:34,146
if this is a spreadsheet
of users?

897
00:38:35,306 --> 00:38:36,726
What kinds of fields?

898
00:38:37,016 --> 00:38:38,606
[ Inaudible Response ]

899
00:38:38,606 --> 00:38:39,866
Yeah, okay so let's do that.

900
00:38:39,866 --> 00:38:43,636
So, you might want a name,
okay so what else besides name?

901
00:38:44,046 --> 00:38:45,796
>> Location, phone number.

902
00:38:45,796 --> 00:38:47,286
>> Location, phone number.

903
00:38:48,346 --> 00:38:50,036
What else, remember that they
need to be able to log in?

904
00:38:50,196 --> 00:38:50,286
Yeah.

905
00:38:50,576 --> 00:38:50,906
>> Password.

906
00:38:51,306 --> 00:38:51,876
>> Password.

907
00:38:51,876 --> 00:38:53,106
So, we need some
kind of password

908
00:38:53,106 --> 00:38:54,766
in the system, so a few fields.

909
00:38:54,766 --> 00:38:55,416
So, let's do this.

910
00:38:55,726 --> 00:38:58,556
Let's create the first of our
spreadsheets called a table

911
00:38:58,766 --> 00:39:00,406
in a database and to be clear,

912
00:39:00,486 --> 00:39:02,026
the fact that we're
using a database

913
00:39:02,026 --> 00:39:04,766
that has tables means it's
a relational database,

914
00:39:04,766 --> 00:39:10,256
RDMS relational database
management system is the buzz

915
00:39:10,256 --> 00:39:12,866
word there and this just
means you're using tables

916
00:39:12,866 --> 00:39:14,976
to store your information
and we'll talk in the future

917
00:39:14,976 --> 00:39:17,946
about alternatives to this among
which are no SEQUEL databases

918
00:39:17,946 --> 00:39:20,346
or object oriented
databases or document stores

919
00:39:20,606 --> 00:39:23,116
which generally means you
just store data differently,

920
00:39:23,196 --> 00:39:24,496
you don't use rows and columns,

921
00:39:24,496 --> 00:39:28,196
you instead store actual say
PHP objects which is a bit

922
00:39:28,196 --> 00:39:29,946
of an oversimplification,
but that idea.

923
00:39:30,376 --> 00:39:32,806
Alright, so let's call
my table of users, again,

924
00:39:32,806 --> 00:39:35,276
I would say a common convention
is used all lowercase,

925
00:39:35,276 --> 00:39:37,266
no special characters
and so forth.

926
00:39:37,266 --> 00:39:38,516
You could call it users,

927
00:39:38,516 --> 00:39:40,686
it's a little messy I
wouldn't call it users.

928
00:39:40,826 --> 00:39:44,186
I wouldn't call it my users,
this is just bad, it will work

929
00:39:44,186 --> 00:39:46,286
but you'll have to quote the
string everywhere cause it has a

930
00:39:46,286 --> 00:39:47,766
space, so in short,

931
00:39:47,986 --> 00:39:50,796
best practice would generally
say you should lowercase simple

932
00:39:50,796 --> 00:39:52,786
words that says what they are,

933
00:39:52,956 --> 00:39:55,356
like don't call table,
call it users.

934
00:39:55,636 --> 00:39:56,626
Alright, how many columns?

935
00:39:56,626 --> 00:39:59,746
Well, let's keep this simple for
now and let's just go with two

936
00:39:59,856 --> 00:40:01,826
for the moment, username
and password

937
00:40:01,826 --> 00:40:03,236
and we can add to it later.

938
00:40:03,536 --> 00:40:05,816
So, let me now click
on go or hit enter

939
00:40:06,436 --> 00:40:10,406
and now I get this field,
this form that I can fill out,

940
00:40:10,576 --> 00:40:11,946
so I've not created
the table yet

941
00:40:11,946 --> 00:40:13,706
but I'm being asked a
few questions so let's go

942
00:40:13,706 --> 00:40:15,376
through this top to
bottom and here's

943
00:40:15,376 --> 00:40:18,396
where frankly PHPMyadmin
interface it's a little ugly

944
00:40:18,396 --> 00:40:21,236
and it needs some work, but it's
definitely more user friendly

945
00:40:21,336 --> 00:40:23,406
than the command line
for this particular task.

946
00:40:23,736 --> 00:40:26,416
So, I'm going to give this
field a name, username,

947
00:40:26,856 --> 00:40:28,436
this one the name password,

948
00:40:28,566 --> 00:40:30,686
though I could call them
whatever I want but like

949
00:40:30,686 --> 00:40:34,326
with variables in a language use
descriptive words and here is

950
00:40:34,326 --> 00:40:36,346
where now the dropdowns
get a little instructive.

951
00:40:36,616 --> 00:40:38,706
Turns out I have a
whole bunch of datatypes

952
00:40:38,996 --> 00:40:42,146
with which it's defining my
schema, so a schema refers

953
00:40:42,146 --> 00:40:43,916
to all of the decisions
we're about to make.

954
00:40:43,916 --> 00:40:45,706
What is the format
of your table?

955
00:40:46,236 --> 00:40:49,126
So, for username what
do you want to go

956
00:40:49,126 --> 00:40:51,256
with here in terms of datatype?

957
00:40:52,156 --> 00:40:53,356
>> Variable char.

958
00:40:54,506 --> 00:40:58,196
>> Yeah, so there's this thing
here variable char or VARCHAR.

959
00:40:58,396 --> 00:41:01,026
This is probably the best
candidate and so let's put

960
00:41:01,026 --> 00:41:02,096
that at top of our mental list

961
00:41:02,096 --> 00:41:04,036
but let's see what else we
might want to rule in or out.

962
00:41:04,336 --> 00:41:06,946
Text, sounds like it's
related so let's maybe keep

963
00:41:06,946 --> 00:41:08,096
that on the mental list.

964
00:41:08,506 --> 00:41:11,186
DATE, no. TINYINT all these
numbers are clearly wrong,

965
00:41:11,186 --> 00:41:14,856
BOOLEAN wrong, DATE and TIME
wrong, okay STRING looks

966
00:41:14,856 --> 00:41:17,116
like we might have to give this
one a bit of thought, CHAR,

967
00:41:17,116 --> 00:41:20,026
VARCHAR, TINYTEXT, TEXT,
MEDIUMTEXT, LONGTEXT

968
00:41:20,306 --> 00:41:22,256
and the rest down there
is some binary stuff.

969
00:41:22,926 --> 00:41:26,136
So, you say VARCHAR
variable char, why?

970
00:41:26,136 --> 00:41:27,476
Or what does that represent?

971
00:41:28,516 --> 00:41:29,306
What is a VARCHAR?

972
00:41:29,426 --> 00:41:35,796
>> If you specify that there
is a ten character law is not

973
00:41:35,796 --> 00:41:40,286
exactly ten maybe less than ten.

974
00:41:40,286 --> 00:41:40,526
>> Okay.

975
00:41:40,656 --> 00:41:42,036
>> Say one through 10.

976
00:41:42,036 --> 00:41:42,126
>> Okay.

977
00:41:42,126 --> 00:41:45,886
>> If is VARCHAR then
they must be three.

978
00:41:45,886 --> 00:41:46,356
>> Exactly.

979
00:41:47,236 --> 00:41:50,356
So, a VARCHAR we're about
to see also requires

980
00:41:50,356 --> 00:41:53,486
that we tell it how long
it should be maximally.

981
00:41:53,726 --> 00:41:56,966
So, variable char means it's a
string but of variable length,

982
00:41:56,966 --> 00:41:58,276
but you have to give
it an upper bound.

983
00:41:58,496 --> 00:42:00,856
So, as you proposed if we
say that this is a VARCHAR

984
00:42:00,856 --> 00:42:04,396
of size ten that means you can
store usernames of length one,

985
00:42:04,396 --> 00:42:07,076
or two, or three, or even
zero, or one, or two, or three,

986
00:42:07,076 --> 00:42:09,516
or four, five, six, seven,
eight, nine, ten but not eleven

987
00:42:09,616 --> 00:42:11,906
or beyond, but the
upside of this is

988
00:42:11,906 --> 00:42:14,506
that if you're username
is just foo,

989
00:42:14,806 --> 00:42:18,016
three letters it's only
going to use three characters

990
00:42:18,016 --> 00:42:20,016
and then you're going to
save those seven additional

991
00:42:20,016 --> 00:42:22,056
characters, seven
additional bytes or what not

992
00:42:22,346 --> 00:42:24,656
and so you have some space
saving, now for one row

993
00:42:24,656 --> 00:42:26,076
who cares, for ten
rows who cares,

994
00:42:26,336 --> 00:42:28,076
for a million rows
that's absolutely going

995
00:42:28,076 --> 00:42:28,856
to start adding up.

996
00:42:29,516 --> 00:42:31,896
But then CHAR contrary

997
00:42:31,896 --> 00:42:35,426
to the name it's not a single
CHAR you still specify the size

998
00:42:35,766 --> 00:42:38,216
but a CHAR field is a fixed
size, so if you say ten,

999
00:42:38,216 --> 00:42:39,256
it's going to use ten bytes.

1000
00:42:39,976 --> 00:42:42,966
If you say eleven,
eleven bytes or maybe more

1001
00:42:42,966 --> 00:42:45,556
if it's using sixteen bit
characters for other languages.

1002
00:42:46,006 --> 00:42:51,256
So, CHAR, VARCHAR why would
you choose one versus the other

1003
00:42:51,256 --> 00:42:52,896
or why does CHAR even exist

1004
00:42:52,896 --> 00:42:55,256
if clearly it's more
wasteful potentially?

1005
00:42:55,256 --> 00:42:55,416
Conner.

1006
00:42:55,566 --> 00:43:00,686
>> I mean maybe CHAR they're
all going to be that length.

1007
00:43:00,686 --> 00:43:02,376
>> So, maybe CHAR if they're
all going to be that length,

1008
00:43:02,376 --> 00:43:04,876
so that's good, so even
though users names might vary

1009
00:43:04,876 --> 00:43:06,816
in length, what might
be a piece of data,

1010
00:43:07,006 --> 00:43:08,376
even though we don't
have room for it yet,

1011
00:43:08,726 --> 00:43:10,316
that's a fixed length
that people have?

1012
00:43:10,746 --> 00:43:12,446
>> Middle initial.

1013
00:43:12,446 --> 00:43:12,696
>> I'm sorry.

1014
00:43:13,006 --> 00:43:13,566
>> Middle initial.

1015
00:43:13,886 --> 00:43:15,926
>> So, middle initial if
you just want one initial,

1016
00:43:15,926 --> 00:43:17,296
a single CHAR would
do the trick.

1017
00:43:17,296 --> 00:43:17,656
What else?

1018
00:43:18,016 --> 00:43:19,956
>> You wouldn't use
a CHAR for this,

1019
00:43:19,956 --> 00:43:22,676
but a date usually has the same.

1020
00:43:22,676 --> 00:43:24,816
>> Okay, so a date, now it
turns out there are date field

1021
00:43:24,856 --> 00:43:27,176
but something like that
that's a known fixed length,

1022
00:43:27,326 --> 00:43:28,646
anything else?

1023
00:43:29,576 --> 00:43:32,386
Maybe phone number,
you know, little tricky

1024
00:43:32,386 --> 00:43:34,196
with different country
codes and what not

1025
00:43:34,196 --> 00:43:35,306
which could be variable length

1026
00:43:35,306 --> 00:43:37,146
but for US numbers
you can use ten digits

1027
00:43:37,626 --> 00:43:39,406
for a state you could
maybe use seven

1028
00:43:39,406 --> 00:43:42,466
if it only has one zip code, so
there's some optimizations there

1029
00:43:42,466 --> 00:43:43,706
if you're not storing
the hyphens

1030
00:43:43,706 --> 00:43:45,326
and parentheses and
things like that.

1031
00:43:45,866 --> 00:43:49,676
But more compellingly, CHARs
are also useful for performance.

1032
00:43:49,676 --> 00:43:52,656
It turns out, and this is a
lower level detail, but it turns

1033
00:43:52,656 --> 00:43:55,276
out that if you do specify CHAR
because you know the length

1034
00:43:55,276 --> 00:43:59,436
in advance, the database can be
faster at searching that because

1035
00:43:59,436 --> 00:44:01,326
if you think of it just
in the most naïve way,

1036
00:44:01,746 --> 00:44:04,436
if you have variable length
CHARs in a column, you know,

1037
00:44:04,436 --> 00:44:05,746
the column's going
to look like this,

1038
00:44:05,746 --> 00:44:07,446
it's going to be a ragged column

1039
00:44:07,446 --> 00:44:09,636
which means this string is this
length, this one's this length,

1040
00:44:09,636 --> 00:44:12,716
this one, you know, it's kind of
uneven which means how do you go

1041
00:44:12,716 --> 00:44:14,096
from one string to another

1042
00:44:14,096 --> 00:44:16,226
if you're doing something
silly like linear search?

1043
00:44:16,636 --> 00:44:19,196
Well, you have to know the
length of each of those strings

1044
00:44:19,196 --> 00:44:21,676
and in the worst case you
have to search through each

1045
00:44:21,676 --> 00:44:24,546
of those strings looking for the
end of the string if you come

1046
00:44:24,546 --> 00:44:28,546
from C or C++/zero
represents the end of a string;

1047
00:44:28,756 --> 00:44:29,626
you essentially have to look

1048
00:44:29,626 --> 00:44:31,556
for that throughout
the column looking

1049
00:44:31,556 --> 00:44:32,846
for the beginnings of new words.

1050
00:44:33,116 --> 00:44:36,356
By contrast, if it's all
CHARs and of length ten,

1051
00:44:36,586 --> 00:44:39,366
the column now looks
beautiful like this even

1052
00:44:39,366 --> 00:44:42,226
if you're wasting some of
those bytes at least they're

1053
00:44:42,226 --> 00:44:45,106
on the same boundaries which
needs to go from this row

1054
00:44:45,106 --> 00:44:47,816
to this one or this string to
this string, it's just plus ten,

1055
00:44:47,816 --> 00:44:49,166
plus ten, plus ten, plus ten,

1056
00:44:49,356 --> 00:44:50,976
so you essentially
get random access

1057
00:44:50,976 --> 00:44:51,916
like you would in an array.

1058
00:44:52,536 --> 00:44:55,226
So, that's the tradeoff you
have to make and this is one

1059
00:44:55,226 --> 00:44:57,556
of these nonobvious decisions
just like in Project Zero,

1060
00:44:57,706 --> 00:44:59,706
there's no one right
answer to menu.XML,

1061
00:45:00,146 --> 00:45:01,856
similarly for Project
One there's going

1062
00:45:01,856 --> 00:45:04,136
to be no one right answer to
how you design your database

1063
00:45:04,136 --> 00:45:06,756
for that project but these are
the kinds of decisions you need

1064
00:45:06,756 --> 00:45:10,026
to think through and make
and sometimes struggle with

1065
00:45:10,026 --> 00:45:12,816
and even then you and your,
you know, business partner

1066
00:45:12,816 --> 00:45:14,176
if you go off after
this class and work

1067
00:45:14,176 --> 00:45:16,296
on something collaboratively,
might not agree

1068
00:45:16,536 --> 00:45:18,416
but hopefully you'll
at least arrive

1069
00:45:18,476 --> 00:45:21,016
to the table a little more
informed than him or her,

1070
00:45:21,016 --> 00:45:23,966
so CHAR versus VARCHAR
but it looks

1071
00:45:23,966 --> 00:45:27,306
like we have four other options,
TINYTEXT, TEXT, MEDIUMTEXT,

1072
00:45:27,306 --> 00:45:31,306
LONGTEXT so these are
generally bigger whereby,

1073
00:45:31,306 --> 00:45:37,136
I think text is usually
something like, it's big 30,

1074
00:45:37,136 --> 00:45:40,656
damn I can't remember, it's
like a few megabytes I think is

1075
00:45:40,656 --> 00:45:42,016
for a field like
that, so this is

1076
00:45:42,016 --> 00:45:44,306
like if you want people
uploading their resumes

1077
00:45:44,546 --> 00:45:47,626
for instance or some long
document or something,

1078
00:45:47,906 --> 00:45:50,816
or a huge HTML page that you've
screen scraped or something

1079
00:45:50,816 --> 00:45:52,746
like that, it might not be a few
megabytes, I'm misremembering,

1080
00:45:53,046 --> 00:45:54,216
but it's bigger than any

1081
00:45:54,216 --> 00:45:56,026
of the fields we just
discussed thus far.

1082
00:45:56,276 --> 00:45:58,456
Generally, CHAR,
VARCHAR are caps,

1083
00:45:58,456 --> 00:46:01,036
years ago the cap
was 255 characters,

1084
00:46:01,036 --> 00:46:06,396
these days it's 65,535 or
36 characters; TEXT is more

1085
00:46:06,396 --> 00:46:08,366
than that, I just don't
remember what it is off hand,

1086
00:46:08,366 --> 00:46:11,646
but the documentation would have
this, so the upside of this is

1087
00:46:11,646 --> 00:46:13,656
that you can store big
quantities of text.

1088
00:46:13,656 --> 00:46:14,906
Well that seems great, right?

1089
00:46:14,906 --> 00:46:18,186
If I, and they are variable
length, so they're like big,

1090
00:46:18,186 --> 00:46:21,156
big, big VARCHARs
so now pushback.

1091
00:46:21,156 --> 00:46:22,956
If you just have this
ability to just, hell,

1092
00:46:22,956 --> 00:46:26,016
let's use long text cause I have
no idea how long someone's name

1093
00:46:26,016 --> 00:46:26,356
is going to be.

1094
00:46:26,356 --> 00:46:28,326
I don't want to choose some
arbitrary cutoff like ten

1095
00:46:28,326 --> 00:46:31,476
or thirty-two or 255,
it will let it be as big

1096
00:46:31,476 --> 00:46:32,386
as the database supports.

1097
00:46:32,996 --> 00:46:35,976
What must the price be
that we're paying to have

1098
00:46:35,976 --> 00:46:38,066
that flexibility of
being able to store names

1099
00:46:38,066 --> 00:46:40,006
that are millions
of characters long?

1100
00:46:40,546 --> 00:46:50,826
What could it be?

1101
00:46:51,046 --> 00:46:52,906
Conner, take a guess anything.

1102
00:46:53,466 --> 00:46:58,736
>> So, so are you saying
like the downside to that?

1103
00:46:58,736 --> 00:47:01,286
>> Not so much, I'll, so sure
you can view it that way.

1104
00:47:01,286 --> 00:47:03,276
There's some price we're
paying, right, cause otherwise

1105
00:47:03,276 --> 00:47:05,546
if you could use long
text for everything,

1106
00:47:05,546 --> 00:47:06,626
it's begs the question,

1107
00:47:06,626 --> 00:47:08,376
why do any of the
other datatypes exist?

1108
00:47:08,446 --> 00:47:10,666
Now, we know why CHAR exists,
but what about VARCHAR?

1109
00:47:15,536 --> 00:47:16,356
>> Just because of
the performance,

1110
00:47:16,356 --> 00:47:19,496
I think it would slow it down
too much having that much data.

1111
00:47:19,496 --> 00:47:21,656
>> Okay, so that's not
bad, that is right,

1112
00:47:21,656 --> 00:47:25,376
so it is actually a performance
thing because text and long text

1113
00:47:25,376 --> 00:47:28,486
and even tiny text which is
still big, it's just smaller

1114
00:47:28,486 --> 00:47:30,616
than text is meant to
be pretty darn big.

1115
00:47:30,706 --> 00:47:33,906
For efficiency reasons when
you store the database on disc

1116
00:47:33,906 --> 00:47:35,566
and this is not a design
detail we developers have

1117
00:47:35,566 --> 00:47:36,716
to care about, it's something

1118
00:47:36,716 --> 00:47:38,516
that my MySQL people
have to care about.

1119
00:47:38,686 --> 00:47:40,146
When they implement
this database,

1120
00:47:40,146 --> 00:47:42,156
they pretty much store
the tables on disc

1121
00:47:42,156 --> 00:47:45,966
or in RAM contiguously and
conceptually this long column

1122
00:47:45,966 --> 00:47:50,146
of text, but when it's a really
long text what they instead do

1123
00:47:50,146 --> 00:47:53,216
is put in the column
a pointer effectively

1124
00:47:53,506 --> 00:47:55,506
to some huge chunk
text elsewhere,

1125
00:47:55,766 --> 00:47:57,106
so in other words the text

1126
00:47:57,106 --> 00:47:59,016
when it's really big is
not stored right there

1127
00:47:59,016 --> 00:48:01,086
so you can't just kind of
hop around looking for it,

1128
00:48:01,086 --> 00:48:02,706
you have to go here and
then look over here,

1129
00:48:02,856 --> 00:48:04,396
back to the table
look over here,

1130
00:48:04,396 --> 00:48:05,766
back to the table, over here.

1131
00:48:05,956 --> 00:48:10,076
So, in short, it's just not as
local to the rest of your data

1132
00:48:10,076 --> 00:48:11,926
as everything else so this
actually has real world

1133
00:48:11,926 --> 00:48:15,046
implications for caching, if
you're familiar with L1 caches,

1134
00:48:15,076 --> 00:48:17,746
L2 caches, some hardware type
things even in memory caches

1135
00:48:17,746 --> 00:48:19,476
that a database would
have, in short,

1136
00:48:19,556 --> 00:48:22,356
the farther away your data is,
the less likely it is to be

1137
00:48:22,356 --> 00:48:25,466
in RAM or in caches at any
given time, so in short,

1138
00:48:25,466 --> 00:48:28,486
this might just have performance
impact as Conner proposed

1139
00:48:28,486 --> 00:48:29,726
and you might not notice it

1140
00:48:29,726 --> 00:48:32,676
until you really have huge
tables, but it's again one

1141
00:48:32,676 --> 00:48:33,726
of those design decisions

1142
00:48:33,726 --> 00:48:35,696
where for the courses
projects certainly,

1143
00:48:35,906 --> 00:48:38,516
you really shouldn't be dabbling
in the text field unless you get

1144
00:48:38,516 --> 00:48:40,736
to the point of uploading,
you know, large documents

1145
00:48:40,736 --> 00:48:43,016
or screen scraping stuff
or just big corpses

1146
00:48:43,016 --> 00:48:43,976
of text then it's compelling.

1147
00:48:44,256 --> 00:48:47,496
Alright, so that's username,
let's go with VARCHAR and,

1148
00:48:48,296 --> 00:48:49,956
oh damn it, now we have
to have a conversation

1149
00:48:49,956 --> 00:48:53,166
about how long a username should
be and this too, not obvious,

1150
00:48:53,166 --> 00:48:54,456
but let's take a
suggestion or two.

1151
00:48:55,686 --> 00:48:57,726
How long should our
variable length username be?

1152
00:48:58,256 --> 00:48:58,356
Yeah.

1153
00:48:59,196 --> 00:49:02,376
>> Thirty-two characters.

1154
00:49:02,376 --> 00:49:04,296
>> Okay, 32 does
anyone disagree?

1155
00:49:04,696 --> 00:49:08,456
Yeah, frankly 31 is
pretty compelling.

1156
00:49:08,666 --> 00:49:11,596
I don't know anyone with
a 32 character username.

1157
00:49:11,966 --> 00:49:12,166
Jack.

1158
00:49:12,166 --> 00:49:16,206
>> I would say 15 cause I
don't know anyone who's goes

1159
00:49:16,206 --> 00:49:16,636
that far either.

1160
00:49:16,636 --> 00:49:19,066
>> Okay, 15 so I can actually
think of some undergrads

1161
00:49:19,066 --> 00:49:21,396
who have crazy long usernames
just because their first

1162
00:49:21,396 --> 00:49:23,736
and their last names
are really long.

1163
00:49:23,826 --> 00:49:26,056
So, 15 makes me a little
uncomfortable cause now we're

1164
00:49:26,056 --> 00:49:27,246
going to, you're going
to have like people

1165
00:49:27,246 --> 00:49:28,476
with stupid user
names where they're

1166
00:49:28,476 --> 00:49:30,456
like losing one or two letters.

1167
00:49:30,896 --> 00:49:33,876
So, maybe 32 but even there
it's like who knows what's best.

1168
00:49:33,876 --> 00:49:36,946
There's some sweet spot but you
don't want to error too high

1169
00:49:37,216 --> 00:49:40,116
because you will potentially
pay a performance impact

1170
00:49:40,116 --> 00:49:43,886
and not all the much, but it's
just you know we should to try

1171
00:49:43,886 --> 00:49:46,436
to keep it as close to
reality as we need to,

1172
00:49:46,586 --> 00:49:48,776
so 32 frankly is not bad
cause any more than that

1173
00:49:48,776 --> 00:49:50,906
and no one's probably ever
typing your email address

1174
00:49:50,906 --> 00:49:53,926
anyway, so 32 fields
at least reasonable.

1175
00:49:54,076 --> 00:49:55,326
Alright, how about password?

1176
00:49:55,326 --> 00:49:56,606
What datatype should
this thing be?

1177
00:49:57,086 --> 00:50:03,476
>> VARCHAR maybe?

1178
00:50:03,636 --> 00:50:05,256
>> Yeah, VARCHARs
fine and in fact,

1179
00:50:05,256 --> 00:50:07,276
for most of our test
fields VARCHAR is fine

1180
00:50:07,276 --> 00:50:08,786
and how long should
the password be?

1181
00:50:10,166 --> 00:50:12,986
I don't know, maybe 32 again,
you know that field's long.

1182
00:50:12,986 --> 00:50:15,536
Most people in this room
probably don't have 32 character

1183
00:50:15,536 --> 00:50:17,526
passwords unless your
super paranoid like Jack.

1184
00:50:17,526 --> 00:50:21,596
>> No, I was just going to
say that most websites say

1185
00:50:21,596 --> 00:50:24,036
that a minimum password
length is 8

1186
00:50:24,036 --> 00:50:28,646
and then I don't know anyone
who really goes beyond 16.

1187
00:50:28,646 --> 00:50:30,506
>> Okay, so 8-16, so
somewhere in there

1188
00:50:30,506 --> 00:50:32,616
and I'll compromise
this time, let's say 16.

1189
00:50:32,986 --> 00:50:34,006
Sure, that's fine.

1190
00:50:34,006 --> 00:50:35,816
So, your user name is more
secure than your password

1191
00:50:35,816 --> 00:50:37,876
in some sense, but
that's fine, so in short,

1192
00:50:37,876 --> 00:50:40,616
just decisions you need to
make and I would generally just

1193
00:50:40,686 --> 00:50:41,976
as a matter of being anal

1194
00:50:42,206 --> 00:50:44,386
at least choose some standard
numbers, powers of two

1195
00:50:44,386 --> 00:50:47,396
or just heuristic so that you're
not making judgment calls every

1196
00:50:47,396 --> 00:50:50,096
which way, just pick
some consistent pattern.

1197
00:50:50,096 --> 00:50:52,716
So, frankly I might even say
32 here just to be consistent,

1198
00:50:52,716 --> 00:50:55,806
but I have no good argument for
that other than the consistency.

1199
00:50:56,076 --> 00:50:58,396
Alright, let's see what else we
can choose here to the right.

1200
00:50:58,706 --> 00:51:00,416
So, now in the top
row we have username,

1201
00:51:00,416 --> 00:51:02,026
bottom row we have
password, default,

1202
00:51:02,316 --> 00:51:05,416
so it turns out that a
database unlike a CSV file,

1203
00:51:05,416 --> 00:51:09,216
unlike an XML file allows you
to specify default values,

1204
00:51:09,366 --> 00:51:11,466
though in fairness there's
ways to do this in XML

1205
00:51:11,466 --> 00:51:13,386
but it's not nearly as
straight forward as this.

1206
00:51:13,726 --> 00:51:16,946
So, should a user have a
default username whereby

1207
00:51:16,996 --> 00:51:19,856
if they don't provide a username
my database will just put

1208
00:51:19,856 --> 00:51:20,766
something there for them?

1209
00:51:21,106 --> 00:51:22,576
Well, you don't have
that much flexibility.

1210
00:51:22,576 --> 00:51:26,096
You can either choose as defined
which means we can all everyone,

1211
00:51:28,036 --> 00:51:31,596
for instance John Dow if they
don't give us a username,

1212
00:51:32,166 --> 00:51:35,436
or we can just say, ah that's
fine their username can be Null

1213
00:51:35,826 --> 00:51:38,776
or it can be a timestamp
which is just wrong here.

1214
00:51:39,046 --> 00:51:41,366
So, what's the right
decision in this dropdown

1215
00:51:41,366 --> 00:51:43,556
for default values for username?

1216
00:51:45,716 --> 00:51:47,466
This one I'd argue
there's a right answer

1217
00:51:47,896 --> 00:51:49,966
so don't guess wrong.

1218
00:51:51,026 --> 00:51:51,156
Yeah.

1219
00:51:51,996 --> 00:51:53,756
>> I would use none.

1220
00:51:53,976 --> 00:51:54,706
>> Yeah, good.

1221
00:51:54,706 --> 00:51:57,686
So, right if they're creating an
account they need a username you

1222
00:51:57,686 --> 00:52:00,396
can't just like give
them a fixed name

1223
00:52:00,396 --> 00:52:03,256
like John Dow cause
it's only going to work

1224
00:52:03,256 --> 00:52:05,756
for the first person, so
in this case it's none

1225
00:52:05,756 --> 00:52:07,706
which means they have to
fill in this field, yeah.

1226
00:52:08,436 --> 00:52:08,696
Axel.

1227
00:52:08,986 --> 00:52:11,836
>> The length of the password.

1228
00:52:12,356 --> 00:52:12,836
>> Okay.

1229
00:52:13,036 --> 00:52:18,456
>> If you, I mean [inaudible]
you don't store the password

1230
00:52:18,566 --> 00:52:19,176
in plain text.

1231
00:52:19,176 --> 00:52:19,326
>> Indeed.

1232
00:52:19,326 --> 00:52:23,236
>> You store with some kind
of encryption and if you use

1233
00:52:23,236 --> 00:52:25,256
like say MD5 it's
going to be 32.

1234
00:52:25,256 --> 00:52:25,726
>> I agree.

1235
00:52:25,726 --> 00:52:27,246
So, let's actually come
back to this concern.

1236
00:52:27,246 --> 00:52:29,566
Let's do it this sort of
naïve clear text way then

1237
00:52:29,566 --> 00:52:31,126
and then we'll come
back and realize shoot

1238
00:52:31,126 --> 00:52:33,546
that was really stupid and then
we'll actually refine that field

1239
00:52:33,546 --> 00:52:35,876
in particular and it also
allows an opportunity

1240
00:52:35,876 --> 00:52:37,656
to edit the tables which
you can do after the fact,

1241
00:52:37,786 --> 00:52:39,136
so we're not writing
this in stone.

1242
00:52:39,456 --> 00:52:42,286
And password value, probably
none either, we want them

1243
00:52:42,286 --> 00:52:43,536
to give us a password as well.

1244
00:52:43,836 --> 00:52:46,526
Coalition is generally
not worrisome,

1245
00:52:46,526 --> 00:52:49,046
you'll see that this just has to
do with encoding of characters.

1246
00:52:49,046 --> 00:52:51,316
You don't have to even
bother filling this in,

1247
00:52:51,316 --> 00:52:53,106
the database will
choose one for you,

1248
00:52:53,106 --> 00:52:55,296
don't worry when it says
it's Swedish, just happens

1249
00:52:55,296 --> 00:52:57,296
that the original author's
in MySQL were Swedish

1250
00:52:57,296 --> 00:53:00,656
and so the coalition by default
is a Swedish encoding set,

1251
00:53:00,776 --> 00:53:03,676
but there is perfect overlap
between that and English

1252
00:53:03,676 --> 00:53:04,666
so it's not a problem.

1253
00:53:04,666 --> 00:53:06,106
You don't have to
fret over that detail.

1254
00:53:06,576 --> 00:53:08,706
So, I've pretty much
always left coalition blank.

1255
00:53:08,906 --> 00:53:11,346
Now attributes, this
is unrelated for now,

1256
00:53:11,346 --> 00:53:12,276
but we'll come back to this.

1257
00:53:12,666 --> 00:53:17,416
You can have a binary attribute,
an unsigned, unsigned zero fill,

1258
00:53:17,416 --> 00:53:19,436
or on update current timestamp,
we'll come back to those,

1259
00:53:19,436 --> 00:53:22,266
but there some additional nice
fields that a database can do

1260
00:53:22,356 --> 00:53:25,046
for us automatically so we
don't have to write it in code.

1261
00:53:25,256 --> 00:53:28,846
Now, on the right-hand side
here, notice that we have,

1262
00:53:29,316 --> 00:53:32,996
whoops, notice that we have a
couple of remaining options,

1263
00:53:33,846 --> 00:53:39,716
the first of which is Null,
can this field be Null?

1264
00:53:39,916 --> 00:53:43,256
So, this is distinct from the
default value even though we saw

1265
00:53:43,256 --> 00:53:46,396
Null a moment ago, that just
meant make the default value

1266
00:53:46,396 --> 00:53:47,306
for this field Null.

1267
00:53:47,526 --> 00:53:51,406
This checkbox lets us make the
decision can it even be Null,

1268
00:53:51,516 --> 00:53:52,416
yes or no?

1269
00:53:52,716 --> 00:53:54,906
So, this is a good one
to make a decision on.

1270
00:53:54,906 --> 00:53:56,926
In this case I think
the story is the same,

1271
00:53:56,926 --> 00:53:59,756
no they cannot be Null so I
should not check this box,

1272
00:54:00,036 --> 00:54:03,246
but even if they could be Null
I still wouldn't want a default

1273
00:54:03,246 --> 00:54:06,126
value of Null, but sometimes
you might want field.

1274
00:54:06,256 --> 00:54:07,366
What's a situation?

1275
00:54:07,366 --> 00:54:09,416
Suppose we were making
more sophisticated users

1276
00:54:09,416 --> 00:54:11,616
that have more than just
users names and passwords,

1277
00:54:11,996 --> 00:54:13,756
what field might be
reasonable to have,

1278
00:54:13,756 --> 00:54:15,456
but allow the user
to leave it Null?

1279
00:54:15,456 --> 00:54:16,386
>> Like middle name.

1280
00:54:16,696 --> 00:54:18,566
>> Yeah, middle name not
everyone has a middle name,

1281
00:54:18,566 --> 00:54:18,976
something else?

1282
00:54:18,976 --> 00:54:23,746
>> When you did the signup thing
for the sports, it's captain

1283
00:54:23,746 --> 00:54:25,306
and if you're not captain.

1284
00:54:25,306 --> 00:54:26,046
>> Yeah, exactly.

1285
00:54:26,636 --> 00:54:28,526
So, captain could be a field.

1286
00:54:28,526 --> 00:54:30,546
It could be a Boolean field
where you at least have a zero

1287
00:54:30,546 --> 00:54:33,436
or a one there explicitly or it
could just be Null or non-Null

1288
00:54:33,436 --> 00:54:35,676
that would work too, so
in short, optional fields

1289
00:54:35,676 --> 00:54:37,936
where you don't want it to be
the empty string, you don't want

1290
00:54:37,936 --> 00:54:41,266
to waste any amount of space
representing an empty string,

1291
00:54:41,266 --> 00:54:42,776
you just want to say
there's nothing even here,

1292
00:54:43,036 --> 00:54:43,746
you can do that.

1293
00:54:43,966 --> 00:54:45,656
Now, index let's come back to

1294
00:54:45,656 --> 00:54:46,956
but we'll have a
discussion shortly

1295
00:54:46,956 --> 00:54:48,376
about primary unique index

1296
00:54:48,376 --> 00:54:51,766
and full text AI is
completely nondescriptive,

1297
00:54:51,886 --> 00:54:55,276
but it means autoincrement
it's also implacable here

1298
00:54:55,486 --> 00:54:57,576
but we'll see when
that might be useful

1299
00:54:57,576 --> 00:54:59,056
and comments is just
for us humans.

1300
00:54:59,496 --> 00:55:01,496
So, at this point I'm going
to go ahead to the bottom

1301
00:55:02,196 --> 00:55:06,666
of the box and I'm going to
say save and hit that button

1302
00:55:06,916 --> 00:55:10,076
and now notice what was just
inputted to the database.

1303
00:55:10,166 --> 00:55:13,316
Here, again, could have done all
of this manually at the black

1304
00:55:13,316 --> 00:55:15,266
and white prompt just
gets a little tedious

1305
00:55:15,266 --> 00:55:16,936
and frankly I never
remember the punctuation

1306
00:55:16,936 --> 00:55:18,086
so I still use tools like this.

1307
00:55:18,566 --> 00:55:21,106
Create table, jharvard
underscore lecture.

1308
00:55:21,106 --> 00:55:24,006
These are back ticks, so for US
keyboards this is the character

1309
00:55:24,006 --> 00:55:27,676
usually on the top left of
your keyboard near the tilde,

1310
00:55:28,286 --> 00:55:31,436
that is my MySQL's
way of escaping things

1311
00:55:31,436 --> 00:55:33,626
that might even have single
or double quotes in them,

1312
00:55:33,626 --> 00:55:35,446
so just realize that's
a MySQL thing,

1313
00:55:35,446 --> 00:55:38,106
these back ticks they
actually have meaning.

1314
00:55:38,106 --> 00:55:39,776
Dot user, so what
is this saying?

1315
00:55:40,046 --> 00:55:42,486
This is saying specifically
create a user's table

1316
00:55:42,486 --> 00:55:43,496
in what database?

1317
00:55:44,216 --> 00:55:45,056
Jharvard lecture.

1318
00:55:45,456 --> 00:55:48,126
Technically you can omit
jharvard underscore lecture.

1319
00:55:48,186 --> 00:55:50,136
if you just say users,

1320
00:55:50,406 --> 00:55:53,226
the database will assume
whatever database we are in now

1321
00:55:53,226 --> 00:55:54,926
and recall that I
clicked on my database

1322
00:55:54,926 --> 00:55:57,546
at the top left earlier,
so I've selected a database

1323
00:55:57,916 --> 00:55:59,696
so this is more rebus
than it needs to be.

1324
00:56:00,076 --> 00:56:03,576
User name now is of type
VARCHAR 32, the spaces are

1325
00:56:03,576 --> 00:56:05,596
of relevant this is just
a pretty printing thing.

1326
00:56:05,866 --> 00:56:09,216
Not Null, obviously just
reiterating what we said before

1327
00:56:09,216 --> 00:56:10,766
and password has the
same constraints.

1328
00:56:10,986 --> 00:56:12,466
Engine we'll talk
about next week,

1329
00:56:12,466 --> 00:56:13,736
but there are different formats

1330
00:56:14,486 --> 00:56:16,126
in which you can
store database tables,

1331
00:56:16,126 --> 00:56:18,726
one of them is innoDB
another MyISAM,

1332
00:56:18,726 --> 00:56:22,306
another is HEAP there's a
whole bunch of others as well.

1333
00:56:22,636 --> 00:56:26,016
For now, know that they just
have to do with performance

1334
00:56:26,296 --> 00:56:29,596
and certain fancy features
and you can think of an analog

1335
00:56:29,596 --> 00:56:33,356
as being a Mac OS has
HFS+ for instance,

1336
00:56:33,396 --> 00:56:37,426
the Mac OS file system, Windows
has NTFS or FAT32, FAT16.

1337
00:56:37,736 --> 00:56:39,586
The database engine
is similar in spirit.

1338
00:56:39,586 --> 00:56:41,726
You can still store data
in different engines,

1339
00:56:41,866 --> 00:56:44,276
but they have different features
just like you can store files

1340
00:56:44,276 --> 00:56:46,586
on Macs and a PCs even though
those file systems have

1341
00:56:46,586 --> 00:56:49,786
different features, so we'll
come back to that detail.

1342
00:56:50,256 --> 00:56:53,236
Alright, so now down here we
see our little Swedish table

1343
00:56:53,506 --> 00:56:57,086
that has usernames and
passwords, the schema therefore,

1344
00:56:57,446 --> 00:56:58,686
but no rows and columns.

1345
00:56:58,686 --> 00:56:59,446
How do I know that?

1346
00:56:59,606 --> 00:57:00,896
Well, if I go up at top and try

1347
00:57:00,896 --> 00:57:03,486
to click browse the GUI
is just yelling at me,

1348
00:57:03,486 --> 00:57:06,136
there the "table is empty", so
clicking browse gets me nowhere.

1349
00:57:06,506 --> 00:57:08,756
So, let's do this sort
of in a cheating fashion,

1350
00:57:08,756 --> 00:57:12,426
let's click on insert and
manually insert some users here,

1351
00:57:12,426 --> 00:57:15,056
so for my username field,
I'm going to type in a value

1352
00:57:15,056 --> 00:57:18,426
of jharvard, and for the
password I'm going type crimson

1353
00:57:18,946 --> 00:57:21,606
and then I could give it
another row and column,

1354
00:57:21,606 --> 00:57:23,986
this ignore thing, this
is again, this has nothing

1355
00:57:23,986 --> 00:57:25,696
to do intellectually
with databases,

1356
00:57:25,696 --> 00:57:27,576
this is just a GUI
that's making it easier

1357
00:57:27,576 --> 00:57:29,246
to put data into this database.

1358
00:57:29,666 --> 00:57:31,576
So, now I'm going to
go ahead and click go,

1359
00:57:32,156 --> 00:57:33,836
notice what just happened.

1360
00:57:33,836 --> 00:57:36,866
I'm being reminded of the
SEQUEL that I could have typed

1361
00:57:37,096 --> 00:57:39,486
if I wanted to, and what
happened here is a new

1362
00:57:39,826 --> 00:57:40,576
SEQUEL command.

1363
00:57:40,576 --> 00:57:42,956
We've seen create database,
we seen create table,

1364
00:57:43,146 --> 00:57:45,046
now we're seeing insert
which is definitely one

1365
00:57:45,046 --> 00:57:46,076
of the most popular ones.

1366
00:57:46,476 --> 00:57:50,206
Insert into jharvardl
ecture.users, and again,

1367
00:57:50,206 --> 00:57:53,006
which of those words
could we leave off?

1368
00:57:53,516 --> 00:57:57,496
"Jharvard lecture" and the
dot not strictly necessary;

1369
00:57:58,206 --> 00:57:59,766
("username", "password")

1370
00:58:00,016 --> 00:58:03,326
and in fact the quotes
are not always necessary,

1371
00:58:03,326 --> 00:58:05,616
it depends on if you have
special keywords of the like,

1372
00:58:05,676 --> 00:58:08,036
bur realize that
PHPMyadmin always puts them

1373
00:58:08,036 --> 00:58:08,746
for good measure.

1374
00:58:09,086 --> 00:58:09,896
Then values.

1375
00:58:10,096 --> 00:58:13,646
So, values is a little
cryptic in that it's not,

1376
00:58:13,646 --> 00:58:15,946
like there's no colon
separating keys from values,

1377
00:58:16,256 --> 00:58:18,826
but notice that username
was first, password.

1378
00:58:18,886 --> 00:58:22,756
So, similarly is jharvard
first, crimson in this case.

1379
00:58:23,326 --> 00:58:25,446
So, now let's do this
manually and notice

1380
00:58:25,446 --> 00:58:29,406
that PHPMyadmin has a SEQUEL
tab up here and that's indeed

1381
00:58:29,406 --> 00:58:32,686
where we just were, now
notice by default it puts

1382
00:58:32,686 --> 00:58:35,406
in this query select star
from users where one.

1383
00:58:35,776 --> 00:58:38,146
This is another common
SEQUEL command,

1384
00:58:38,366 --> 00:58:41,506
select and it is what you
use to search a database.

1385
00:58:41,926 --> 00:58:44,666
So, in this case select
star from users where one.

1386
00:58:45,086 --> 00:58:47,376
Translate that even if you've
never seen SEQUEL before just

1387
00:58:47,376 --> 00:58:48,936
based on instinct to English.

1388
00:58:49,076 --> 00:58:49,896
What is this doing?

1389
00:58:51,226 --> 00:58:51,706
Yeah, Conner.

1390
00:58:55,976 --> 00:58:57,526
>> Is it just saying
select all like items

1391
00:58:57,526 --> 00:58:58,706
and users that exist basically.

1392
00:58:58,706 --> 00:59:00,466
>> Exactly, select all users,
all fields, or all columns

1393
00:59:00,466 --> 00:59:02,366
from users that exist,

1394
00:59:02,366 --> 00:59:04,156
cause where one is
just obviously true.

1395
00:59:04,436 --> 00:59:05,686
So, we can actually
wittle this down,

1396
00:59:05,686 --> 00:59:08,276
the where one technically
doesn't need to be there,

1397
00:59:08,396 --> 00:59:11,186
it's pointing now that this,
you can have conditionals

1398
00:59:11,346 --> 00:59:14,366
in a sentence kind of like a
predicate that had in XPath.

1399
00:59:14,516 --> 00:59:17,736
So, this is equivalent and you
know what's also equivalent here

1400
00:59:17,856 --> 00:59:21,306
is select username,
password from users.

1401
00:59:22,286 --> 00:59:25,856
That also is equivalent, the
star is nice shorthand notion

1402
00:59:25,856 --> 00:59:28,376
at least when you have
a bunch of fields.

1403
00:59:28,806 --> 00:59:30,626
So, now let me click go
and see what we get back.

1404
00:59:31,626 --> 00:59:34,126
So, now notice it's a
little overwhelming in terms

1405
00:59:34,126 --> 00:59:35,096
of the GUI, the only thing

1406
00:59:35,096 --> 00:59:38,106
that matters is the temporary
table that came back.

1407
00:59:38,106 --> 00:59:43,726
What select does is it selects
rows from your table and returns

1408
00:59:43,726 --> 00:59:45,346
to effectively a
temporary table.

1409
00:59:45,616 --> 00:59:47,836
Now, at the moment this
is really uninteresting

1410
00:59:47,836 --> 00:59:50,736
because I selected everything
so it's equivalent to what I saw

1411
00:59:50,736 --> 00:59:52,936
when I browsed this
table a moment ago.

1412
00:59:53,106 --> 00:59:55,006
Let's see in another
format just so as not

1413
00:59:55,006 --> 00:59:56,496
to get too distracted
by the GUI.

1414
00:59:56,496 --> 01:00:00,176
If I go back to my terminal
window, let me now do show,

1415
01:00:00,416 --> 01:00:02,316
I'll do it in caps,
SHOW DATABASES;

1416
01:00:03,286 --> 01:00:05,716
notice if you forget
a semicolon,

1417
01:00:05,946 --> 01:00:07,306
you instead get another prompt

1418
01:00:07,306 --> 01:00:09,146
so you can then hit
the semicolon there

1419
01:00:09,456 --> 01:00:10,496
to make it behave.

1420
01:00:10,786 --> 01:00:15,186
So, now jharvard lecture,
so I could do select star

1421
01:00:15,296 --> 01:00:20,836
from jharvard lecture.users
where one;

1422
01:00:21,036 --> 01:00:24,446
enter and there's
my columns and rows.

1423
01:00:24,636 --> 01:00:26,476
So, this what I meant
by old school before,

1424
01:00:26,476 --> 01:00:29,576
like this really is ASCII Art
representing a database table

1425
01:00:29,576 --> 01:00:32,146
and this is fine for small
datasets, this is a nightmare

1426
01:00:32,146 --> 01:00:33,726
when you actually have
lots of rows and columns.

1427
01:00:33,726 --> 01:00:35,696
You just can't do it at
the command line like this,

1428
01:00:36,046 --> 01:00:39,476
but now let's trim this query to
be a little more user friendly.

1429
01:00:39,476 --> 01:00:41,186
I don't need this
useless predicate,

1430
01:00:41,566 --> 01:00:44,596
I also don't need this or do I?

1431
01:00:44,806 --> 01:00:46,496
Let's see select
star from users.

1432
01:00:47,916 --> 01:00:49,406
So, I haven't selected
a database,

1433
01:00:49,976 --> 01:00:56,226
so I can actually do this use
jharvard underscore lecture;

1434
01:00:56,226 --> 01:00:59,496
so now notice databased changed.

1435
01:00:59,706 --> 01:01:01,816
I can hit up and down to
go through my history,

1436
01:01:01,816 --> 01:01:05,576
so now if I do select star from
users, enter, now it works.

1437
01:01:05,816 --> 01:01:06,806
So, why is this relevant?

1438
01:01:06,936 --> 01:01:09,156
PHP is going to have
the same exact mechanism

1439
01:01:09,156 --> 01:01:11,266
where you first connect
to a database like I did

1440
01:01:11,266 --> 01:01:13,796
at the command line and
then you select a database

1441
01:01:14,016 --> 01:01:16,876
and then you don't have to worry
about hardcoding you table,

1442
01:01:17,116 --> 01:01:20,046
your database name all over
the place which gets tedious

1443
01:01:20,186 --> 01:01:22,236
and also means if you ever
change your database name

1444
01:01:22,236 --> 01:01:24,116
because you move your code
from one server to another,

1445
01:01:24,416 --> 01:01:27,286
you have to change hundreds of
lines of code potentially or one

1446
01:01:27,286 --> 01:01:31,066
if you factored it out
to a configuration file.

1447
01:01:31,286 --> 01:01:33,726
Alright, so in short,
where are going with this?

1448
01:01:33,806 --> 01:01:37,386
So, SEQUEL has a whole
bunch of common statements;

1449
01:01:37,856 --> 01:01:40,246
create which we've seen,
haven't seen altered yet;

1450
01:01:40,396 --> 01:01:42,496
kind of saw drop cause
I did it manually

1451
01:01:42,496 --> 01:01:45,626
to delete a database I didn't
want; select we're going to see,

1452
01:01:45,626 --> 01:01:49,176
we saw insert, we'll see update;
delete and then a whole bunch

1453
01:01:49,176 --> 01:01:50,836
of others but why
don't we go ahead here,

1454
01:01:50,836 --> 01:01:57,686
take five minute break and
then regroup after that.

1455
01:01:57,886 --> 01:01:59,266
Alright so we are back.

1456
01:01:59,266 --> 01:02:00,816
So, let's actually do
something with this,

1457
01:02:00,816 --> 01:02:02,276
so thus far all we've
done is play

1458
01:02:02,276 --> 01:02:05,016
around with the database both
the command line and the GUI,

1459
01:02:05,016 --> 01:02:06,616
but let's actually
try talking now

1460
01:02:06,616 --> 01:02:08,236
to the database with
actual code.

1461
01:02:08,566 --> 01:02:11,486
So, let me go ahead and open
up a somewhat familiar example

1462
01:02:11,486 --> 01:02:17,736
from a couple of lectures
ago namely these guys here.

1463
01:02:18,936 --> 01:02:20,796
Let me go ahead and pull

1464
01:02:20,796 --> 01:02:29,436
up in the appliance today's
source code and let me go

1465
01:02:29,436 --> 01:02:31,616
into the log in directory
and notice

1466
01:02:31,616 --> 01:02:33,026
that we have these
files this time.

1467
01:02:33,026 --> 01:02:35,036
So, a couple of these
are familiar home.PHP

1468
01:02:35,036 --> 01:02:38,136
and probably logout.PHP, but
we left off with 1, 2, 3,

1469
01:02:38,216 --> 01:02:41,356
4 last time, now we have a
few more variants, 5, 6, 7,

1470
01:02:41,356 --> 01:02:43,956
and 8 which now actually
introduce a database.

1471
01:02:44,166 --> 01:02:46,126
So, let's see what
home.PHP looks like.

1472
01:02:46,516 --> 01:02:51,866
Let me go over to my appliance,
go into the root here,

1473
01:02:52,266 --> 01:02:55,446
choose log in and choose
home and to be clear now

1474
01:02:55,446 --> 01:02:57,786
that you do have presumably
or will have this week

1475
01:02:57,786 --> 01:03:00,046
for the project the appliance,
realize that you can download

1476
01:03:00,046 --> 01:03:01,226
of these examples, put them

1477
01:03:01,226 --> 01:03:03,146
in John Harvard's
public HTML directly

1478
01:03:03,416 --> 01:03:05,376
or you can create a Vhost
like you were guided

1479
01:03:05,476 --> 01:03:07,886
through for Project Zero and you
can call it whatever you want.

1480
01:03:07,886 --> 01:03:10,026
I called mine appliance and
you can actually play with all

1481
01:03:10,026 --> 01:03:11,346
of these examples hands on.

1482
01:03:11,556 --> 01:03:13,056
Alright, so you are
not logged in

1483
01:03:13,316 --> 01:03:16,736
and does anyone recall how
this file, home.PHP knows

1484
01:03:16,736 --> 01:03:17,906
that I am not logged in?

1485
01:03:17,906 --> 01:03:17,986
Yeah.

1486
01:03:17,986 --> 01:03:22,016
>> So, it looked into sessions
super global and looked

1487
01:03:22,016 --> 01:03:24,946
after log in true or something.

1488
01:03:24,946 --> 01:03:26,816
>> Exactly, so it looked
in the session super global

1489
01:03:26,816 --> 01:03:28,736
and it checked for a flag
called authenticated,

1490
01:03:28,736 --> 01:03:31,506
a key called authenticated
and if it's set there

1491
01:03:31,736 --> 01:03:34,256
that means the user
is indeed logged in.

1492
01:03:34,546 --> 01:03:37,306
Alright, so let's go ahead
and try logging in here.

1493
01:03:37,306 --> 01:03:41,386
Version 5 has this problem
though at the moment.

1494
01:03:41,386 --> 01:03:44,446
So, not bad it's
deliberately supposed to happen

1495
01:03:44,446 --> 01:03:46,966
because I haven't provided any
of the username or password

1496
01:03:46,966 --> 01:03:49,316
or database details
that we just created

1497
01:03:49,506 --> 01:03:52,346
on the fly using PHPMyadmin and
you can kind of infer as much

1498
01:03:52,346 --> 01:03:57,136
in orange, MySQL connect, access
denied for user "at localhost"

1499
01:03:57,136 --> 01:03:59,346
so that already doesn't
look right.

1500
01:03:59,446 --> 01:04:01,556
Using password no, that
doesn't sound right.

1501
01:04:01,556 --> 01:04:03,236
So, a whole bunch
of problems all

1502
01:04:03,236 --> 01:04:06,006
of which can be traced
back apparently line 17.

1503
01:04:06,236 --> 01:04:13,206
So, let me go into the code in
line 17 with Geany on log in 5

1504
01:04:14,196 --> 01:04:17,246
and we'll see here in my text
editor, whoops, we'll see here

1505
01:04:17,246 --> 01:04:23,256
in my text editor
following source code.

1506
01:04:24,356 --> 01:04:26,046
So, how is this starting?

1507
01:04:26,376 --> 01:04:28,316
So, first I'm calling
session start.

1508
01:04:28,316 --> 01:04:29,196
Let me zoom in.

1509
01:04:30,076 --> 01:04:31,816
What is session start
doing for me again?

1510
01:04:31,816 --> 01:04:32,066
Yeah, Axel.

1511
01:04:32,636 --> 01:04:35,706
>> It's starting the session so.

1512
01:04:35,706 --> 01:04:38,176
>> I could have told you that.

1513
01:04:38,176 --> 01:04:40,236
>> But that they
connect the super globals

1514
01:04:40,236 --> 01:04:43,416
and everything after
you do that.

1515
01:04:43,416 --> 01:04:46,206
>> Okay, good so it's enabling
you to access the super global

1516
01:04:46,206 --> 01:04:47,436
as dollar sign underscore
session.

1517
01:04:47,676 --> 01:04:48,506
Let's be more concrete.

1518
01:04:48,506 --> 01:04:51,166
What is it functionally
doing underneath the hood?

1519
01:04:51,936 --> 01:04:53,706
How is it achieving
that end result?

1520
01:04:59,136 --> 01:05:01,396
Put another way, how
do sessions work?

1521
01:05:02,206 --> 01:05:03,166
How do they get started?

1522
01:05:03,696 --> 01:05:03,776
Huh?

1523
01:05:10,376 --> 01:05:14,856
>> I believe, I think
something is sent in the,

1524
01:05:15,576 --> 01:05:17,766
from when you request a page

1525
01:05:18,136 --> 01:05:21,816
and the session start is
included the things you actually

1526
01:05:21,816 --> 01:05:23,456
receive has like a session ID.

1527
01:05:23,816 --> 01:05:27,206
>> Good. So, calling
session start ensures

1528
01:05:27,456 --> 01:05:30,776
that the server will send
if necessary a set cookie,

1529
01:05:30,986 --> 01:05:36,146
HTP header with a PHP ses ID
cookie whose value is a big

1530
01:05:36,146 --> 01:05:39,286
random sequence of letters and
numbers and will also ensure

1531
01:05:39,286 --> 01:05:42,746
that if the user sent
us a cookie:header,

1532
01:05:42,746 --> 01:05:45,266
so not set cookie,
but cookie:header

1533
01:05:45,616 --> 01:05:49,766
and that T-value is present
then it will use that key value

1534
01:05:49,966 --> 01:05:52,046
to look by default in slash temp

1535
01:05:52,336 --> 01:05:55,516
where there are files
named almost identically

1536
01:05:55,516 --> 01:05:58,396
to the cookie value, which
is again big random number,

1537
01:05:58,606 --> 01:06:02,716
and will give me the developer
the illusion of having access

1538
01:06:02,746 --> 01:06:04,456
to the same shopping
cart or what not

1539
01:06:04,456 --> 01:06:05,696
that I previously had access to.

1540
01:06:05,696 --> 01:06:08,596
Now, it's not a shopping cart
here, it's just a storage

1541
01:06:08,596 --> 01:06:10,976
for the authenticated
flag, but same idea.

1542
01:06:11,386 --> 01:06:13,756
Alright, so the next line
connect the database is

1543
01:06:13,756 --> 01:06:14,236
the comment.

1544
01:06:14,466 --> 01:06:16,566
So, there's a new function
here and we can do better

1545
01:06:16,566 --> 01:06:18,116
than this function, but
it turns out this is one

1546
01:06:18,116 --> 01:06:21,456
that you'll see fairly on
the presently in code online,

1547
01:06:21,456 --> 01:06:24,566
open source code, and the like
called MySQL underscore connect

1548
01:06:24,566 --> 01:06:28,056
but we'll better versions
next week that will hide some

1549
01:06:28,056 --> 01:06:30,016
of the details we'll now
get our hands dirty with.

1550
01:06:30,456 --> 01:06:33,466
So, MySQL connect takes three
arguments which you can see

1551
01:06:33,466 --> 01:06:36,786
in documentation if you pull
it up on PHP.net, but I happen

1552
01:06:36,786 --> 01:06:40,676
to remember that the first field
is the IP address or the name

1553
01:06:40,676 --> 01:06:43,696
of the server which in this
case is going to be localhost

1554
01:06:43,696 --> 01:06:46,856
because I'm in the appliance
and the database server happens

1555
01:06:46,916 --> 01:06:50,246
to be on the same physical
server as the web server,

1556
01:06:50,466 --> 01:06:51,836
so it doesn't need to
be one in the same,

1557
01:06:51,836 --> 01:06:52,766
but in this case it is.

1558
01:06:53,106 --> 01:06:56,776
My username recall is John
Harvard, my password is crimson

1559
01:06:57,006 --> 01:06:58,366
and what am I checking for here?

1560
01:06:58,366 --> 01:07:01,986
So, if connection equals MySQL
connect equals, equals, equals,

1561
01:07:01,986 --> 01:07:03,466
false so some new syntax.

1562
01:07:03,706 --> 01:07:06,036
So, first of all MySQL
connect is a function

1563
01:07:06,216 --> 01:07:07,346
that does exactly what it says.

1564
01:07:07,606 --> 01:07:09,826
It connects to the database
using those credentials

1565
01:07:10,166 --> 01:07:13,186
and that database
server, localhost.

1566
01:07:13,716 --> 01:07:14,516
What does it return?

1567
01:07:14,516 --> 01:07:17,036
It returns a pointer
to a connection object,

1568
01:07:17,036 --> 01:07:19,746
I don't really know what that
means yet, but it's some kind

1569
01:07:19,746 --> 01:07:21,226
of reference to the
open connection

1570
01:07:21,226 --> 01:07:25,546
so that I can subsequently send
commands to the database server

1571
01:07:26,146 --> 01:07:29,606
and then equals, equals,
false what is this here for?

1572
01:07:29,826 --> 01:07:32,756
Well, if something goes
wrong MySQL connect is going

1573
01:07:32,756 --> 01:07:35,156
to return false and it
actually did when we saw

1574
01:07:35,156 --> 01:07:38,436
that big orange error
message a moment ago, it was,

1575
01:07:38,436 --> 01:07:42,126
that was returning false but
why is it equals, equals,

1576
01:07:42,366 --> 01:07:44,326
equals feels like a typo?

1577
01:07:44,326 --> 01:07:44,576
Anyone know?

1578
01:07:44,736 --> 01:07:44,803
Yeah.

1579
01:07:44,803 --> 01:07:46,556
>> I think two equals
is the value

1580
01:07:47,256 --> 01:07:52,206
and two equals is also the type.

1581
01:07:52,406 --> 01:07:53,066
>> Exactly.

1582
01:07:53,066 --> 01:07:55,556
So, this is exists in other
languages to JavaScript

1583
01:07:55,556 --> 01:07:59,586
for instance, because PHP is
weakly typed whereby you don't

1584
01:07:59,586 --> 01:08:01,156
specify in sin floats

1585
01:08:01,156 --> 01:08:03,976
and strings explicitly you just
use variables however you want.

1586
01:08:04,816 --> 01:08:07,066
If you nonetheless
want to ensure

1587
01:08:07,186 --> 01:08:11,656
that you're testing a variable
against a value not only bases

1588
01:08:11,806 --> 01:08:17,696
on its bitwise, not only based
on its value like true or false,

1589
01:08:17,896 --> 01:08:20,706
but also based on its type
so that is has to be false

1590
01:08:20,706 --> 01:08:23,816
and it has to be a Boolean
you use equals, equals,

1591
01:08:23,896 --> 01:08:26,766
equals or not bang
equals, equals.

1592
01:08:26,766 --> 01:08:29,446
So, you use three characters
instead of the usual two,

1593
01:08:29,666 --> 01:08:32,006
so in short, this is the
identity operator it's not the

1594
01:08:32,006 --> 01:08:34,136
equality operator, it's
the identity operator

1595
01:08:34,136 --> 01:08:37,266
which means test for
equality and for type.

1596
01:08:37,626 --> 01:08:38,776
Now, why is this relevant?

1597
01:08:39,146 --> 01:08:40,596
Because if just use equals,

1598
01:08:40,596 --> 01:08:44,926
equals and MySQL
connect returns zero,

1599
01:08:45,326 --> 01:08:49,656
well zero is not false per se,
but it is pretty darn close

1600
01:08:49,656 --> 01:08:53,416
to false and in fact if you
convert an integer like zero

1601
01:08:53,416 --> 01:08:55,586
to a Boolean what do
you get typically?

1602
01:08:56,326 --> 01:08:56,526
>> False.

1603
01:08:56,636 --> 01:09:00,126
>> So, you get false but zero
is not really false and what if,

1604
01:09:00,206 --> 01:09:01,316
even though this
is not the case,

1605
01:09:01,436 --> 01:09:04,896
what if MySQL connects purpose
in life was to return integers?

1606
01:09:05,276 --> 01:09:07,876
Well, then you couldn't
distinguish the integer zero

1607
01:09:08,196 --> 01:09:13,536
from the Boolean return value of
false, so this is a little weird

1608
01:09:13,536 --> 01:09:15,396
in PHP in that you can
return multiple datatypes,

1609
01:09:15,396 --> 01:09:18,526
but it's a very common paradigm
and frankly it's a little useful

1610
01:09:18,526 --> 01:09:21,886
in that you can return one
datatype 99.9% of the time

1611
01:09:21,886 --> 01:09:23,386
like ints or strings
or the like,

1612
01:09:23,736 --> 01:09:25,536
but just in case
something goes wrong,

1613
01:09:25,776 --> 01:09:29,136
you can return a different
type like a Bool namely false.

1614
01:09:29,376 --> 01:09:33,966
So, this is different from a lot
of languages whereby if you want

1615
01:09:33,966 --> 01:09:36,166
to return a sentinel
value to signal an error,

1616
01:09:36,426 --> 01:09:39,026
you have to reserve
certain numbers for instance

1617
01:09:39,026 --> 01:09:41,606
or certain strings so it's
common convention in a lot

1618
01:09:41,606 --> 01:09:44,616
of languages to return negative
one to signify an error.

1619
01:09:44,826 --> 01:09:47,736
The problem with that of course
is that you're killing half

1620
01:09:47,736 --> 01:09:48,726
of your address space.

1621
01:09:48,826 --> 01:09:51,566
If you're reserving all two
billion negative integers

1622
01:09:51,806 --> 01:09:53,346
for error messages,
I mean that's a lot

1623
01:09:53,346 --> 01:09:56,056
of potential mistakes
and you're killing half

1624
01:09:56,056 --> 01:09:57,746
of your possible address space,

1625
01:09:57,746 --> 01:10:01,036
so this is much nicer even
though it's a little bit messy.

1626
01:10:01,836 --> 01:10:05,066
Alright, so connections;
suppose it goes through,

1627
01:10:05,066 --> 01:10:06,726
it doesn't return false
what do we have to do next?

1628
01:10:07,046 --> 01:10:08,796
Well, you can kind of guess
from this next line based

1629
01:10:08,796 --> 01:10:11,036
on the example we did
earlier, select database.

1630
01:10:11,606 --> 01:10:14,296
So, this just chooses a default
database so we don't have

1631
01:10:14,296 --> 01:10:15,866
to type the database
name all over the place

1632
01:10:15,866 --> 01:10:18,516
so that's helpful too and
now notice I'm passing

1633
01:10:18,516 --> 01:10:21,356
in two things, one of which I
need to fill in the blank for,

1634
01:10:21,556 --> 01:10:24,166
the other of which is a
reference to the previous line,

1635
01:10:24,486 --> 01:10:26,626
so now I'm going to go
ahead and pass in the name

1636
01:10:26,626 --> 01:10:28,446
of the database which was what?

1637
01:10:32,056 --> 01:10:33,416
>> Jharvard underscore lecture.

1638
01:10:33,416 --> 01:10:37,516
>> Yeah. Jharvard underscore
lecture and as an aside,

1639
01:10:37,826 --> 01:10:41,866
this is not strictly necessary,
if you don't bother passing

1640
01:10:41,866 --> 01:10:45,586
that in and you don't even
bother retaining that value,

1641
01:10:45,866 --> 01:10:50,396
MySQL connect and MySQL select
database will assume you want

1642
01:10:50,396 --> 01:10:52,636
the most recently open
connection that was just open

1643
01:10:52,636 --> 01:10:54,556
so that's a nice little
convenience, but I'll go back

1644
01:10:54,556 --> 01:10:56,486
to the original slightly
more rigorous way.

1645
01:10:56,816 --> 01:10:59,396
Alright, so now some of the code
is pretty familiar even though

1646
01:10:59,396 --> 01:11:00,286
it's been a few days.

1647
01:11:00,536 --> 01:11:04,166
If is set user and if is set
pass, what is the implication?

1648
01:11:04,336 --> 01:11:06,456
Well, that means that the
user submitted the form,

1649
01:11:06,616 --> 01:11:08,586
right and we had this
conversation a couple lectures

1650
01:11:08,586 --> 01:11:10,346
ago where this is
one way of inferring

1651
01:11:10,346 --> 01:11:12,216
that a form was in
fact submitted.

1652
01:11:12,516 --> 01:11:13,606
Alright, what comes next?

1653
01:11:13,606 --> 01:11:16,506
Well, now we have some SEQUEL
code and there's a few ways

1654
01:11:16,556 --> 01:11:20,566
to do this and let me do it the
wrong way first, so I'm going

1655
01:11:20,566 --> 01:11:22,556
to delete what's here now
and we're going to do it more

1656
01:11:22,556 --> 01:11:23,996
of a naïve way first.

1657
01:11:24,606 --> 01:11:26,916
So, let me go ahead and do this.

1658
01:11:27,556 --> 01:11:31,046
A SEQUEL variable and I'm
going to just do select star

1659
01:11:31,046 --> 01:11:36,756
from users where user
equals "jharvard" okay.

1660
01:11:36,756 --> 01:11:39,906
So, this line is obviously
wrong functionally,

1661
01:11:39,906 --> 01:11:41,496
syntactically it
is in fact correct,

1662
01:11:41,496 --> 01:11:44,046
but why is this obviously the
wrong way to implement this?

1663
01:11:44,136 --> 01:11:44,203
Yeah.

1664
01:11:44,203 --> 01:11:44,706
>> Because it's going to look

1665
01:11:44,706 --> 01:11:48,926
for jharvard no matter
what user [inaudible].

1666
01:11:49,006 --> 01:11:49,446
>> Exactly.

1667
01:11:49,446 --> 01:11:51,956
It's going to look for John
Harvard no matter what even

1668
01:11:51,956 --> 01:11:54,416
if the user is David,
or Chris, or Alon

1669
01:11:54,416 --> 01:11:55,966
or something else altogether.

1670
01:11:56,226 --> 01:11:57,306
So, obviously broken,

1671
01:11:57,476 --> 01:12:00,016
but at least it lets you see
the SEQUEL syntax a little more

1672
01:12:00,016 --> 01:12:00,896
clearly for the moment.

1673
01:12:01,006 --> 01:12:01,886
Now, what's the goal?

1674
01:12:01,886 --> 01:12:03,656
The goal at hand is
to determine whether

1675
01:12:03,656 --> 01:12:06,606
or not the person who's trying
to log in, their username

1676
01:12:06,606 --> 01:12:07,946
and their password is correct.

1677
01:12:08,236 --> 01:12:11,116
We can do this is a whole
bunch of ways, but intuitively

1678
01:12:11,476 --> 01:12:13,736
if I can find the
same user and password

1679
01:12:13,736 --> 01:12:16,806
in the database that's a pretty
good start because if I can find

1680
01:12:16,806 --> 01:12:18,936
that username and password it
means this person has typed

1681
01:12:18,936 --> 01:12:20,346
in someone's user name

1682
01:12:20,346 --> 01:12:22,636
and password correctly even
though maybe it's not actually

1683
01:12:22,636 --> 01:12:25,446
John Harvard, but that's a
different problem altogether,

1684
01:12:25,726 --> 01:12:27,606
so jharvard isn't quite right,

1685
01:12:28,036 --> 01:12:34,586
so I could do this $
underscore POST "username"

1686
01:12:34,896 --> 01:12:36,736
but there's a couple
things wrong with this.

1687
01:12:37,016 --> 01:12:38,866
One, you can tell from
the quotes I'm kind

1688
01:12:38,866 --> 01:12:40,306
of throwing off the
balance here.

1689
01:12:40,616 --> 01:12:43,596
Two, this is such an ugly
looking variable it needs

1690
01:12:43,596 --> 01:12:46,886
to be explicitly interpolated
by putting these curly quotes

1691
01:12:46,886 --> 01:12:49,686
around it, or curly
braces around it.

1692
01:12:49,846 --> 01:12:54,896
Alternatively, I could do this,
let me try this another way.

1693
01:12:55,246 --> 01:12:56,566
I could do ".username.'

1694
01:12:56,946 --> 01:13:03,336
so that's another way and this
is kind of a stylistic decision,

1695
01:13:03,656 --> 01:13:07,266
frankly you see this a lot even
though I find it harder to read.

1696
01:13:07,826 --> 01:13:12,146
So, any problems with this?

1697
01:13:12,336 --> 01:13:14,646
First, it's first noting the
single quote is important.

1698
01:13:14,646 --> 01:13:17,586
I this case doesn't matter
if I'm using double quotes

1699
01:13:17,776 --> 01:13:20,286
and the single quotes or single
quotes and then double quotes,

1700
01:13:20,386 --> 01:13:24,146
but anytime you are searching
based on a string you do need

1701
01:13:24,146 --> 01:13:25,706
to quote things in MySQL,

1702
01:13:25,706 --> 01:13:27,586
so this is important
having the single quotes

1703
01:13:27,586 --> 01:13:28,556
around the username.

1704
01:13:29,146 --> 01:13:30,816
So, that else could we do here?

1705
01:13:30,816 --> 01:13:33,756
Well, let me adopt this approach
and let me steal this code

1706
01:13:33,756 --> 01:13:35,796
and move this up a line.

1707
01:13:37,476 --> 01:13:39,096
So, what about this version?

1708
01:13:39,426 --> 01:13:41,936
So, you might not have seen this
function before, but it exists

1709
01:13:41,936 --> 01:13:45,316
in a bunch of languages
sprintf, stringprintf,

1710
01:13:45,766 --> 01:13:49,046
notice that I can use
printf like format strings

1711
01:13:49,046 --> 01:13:51,886
as I have here as
%s as a placeholder

1712
01:13:52,156 --> 01:13:55,236
and then sprintf plugs
in its second argument

1713
01:13:55,516 --> 01:13:57,696
to that placeholder and
then optionally it's third

1714
01:13:57,696 --> 01:14:00,296
and its fourth and so
forth, so why is the useful?

1715
01:14:00,296 --> 01:14:02,046
Really, this just is an
esthetic thing for me.

1716
01:14:02,316 --> 01:14:04,946
I actually find at the
moment the third line

1717
01:14:04,946 --> 01:14:08,346
in orange much more readable
that says select star from users

1718
01:14:08,636 --> 01:14:11,056
where a user equals "%s", why?

1719
01:14:11,306 --> 01:14:13,016
Cause I can kind of
read it all in one byte

1720
01:14:13,336 --> 01:14:16,046
and then I can mentally go
back and plug in the values

1721
01:14:16,106 --> 01:14:18,796
to the %s, but I'm also
doing something else.

1722
01:14:19,506 --> 01:14:21,616
One of the stupidest
names for a function ever,

1723
01:14:21,616 --> 01:14:26,086
but what do you think
it's doing?

1724
01:14:26,286 --> 01:14:27,806
What's the point of
this function called?

1725
01:14:27,806 --> 01:14:28,016
Isaac.

1726
01:14:28,216 --> 01:14:33,896
>> I think it's to, if the
username posts a malicious code.

1727
01:14:33,896 --> 01:14:36,476
>> Yeah, so if the user
somehow inserts malicious code.

1728
01:14:36,806 --> 01:14:40,456
Similar in spirit to the XSS
cross site scripting attack we

1729
01:14:40,456 --> 01:14:42,196
talked about earlier
albeit in the world

1730
01:14:42,196 --> 01:14:43,596
of JavaScript and HTML.

1731
01:14:43,986 --> 01:14:46,906
If they type in bad
characters, maybe quotes feels

1732
01:14:46,906 --> 01:14:48,496
like quotes can be a
dangerous character

1733
01:14:48,496 --> 01:14:49,896
if I'm using quotes
in my string.

1734
01:14:50,196 --> 01:14:51,816
There's other things
that can be dangerous.

1735
01:14:51,816 --> 01:14:54,696
What if the user says their
username is delete that feels

1736
01:14:54,696 --> 01:14:57,436
like it could be bad if I screw
up with the interpretation

1737
01:14:57,436 --> 01:15:00,536
of it, so in short MySQL
real escape string protects

1738
01:15:00,536 --> 01:15:03,436
against a class of attacks known
as SEQUEL injection attacks

1739
01:15:03,436 --> 01:15:05,696
and we'll come back
to those next week

1740
01:15:05,696 --> 01:15:09,306
in our security lecture, but
for now know that this is good

1741
01:15:09,546 --> 01:15:11,536
and leaving this off is bad.

1742
01:15:11,816 --> 01:15:13,876
Now, I say it's kind of
stupid name for a function

1743
01:15:13,876 --> 01:15:17,186
because years ago PHP had no
notion of name space support,

1744
01:15:17,396 --> 01:15:20,336
did not have object oriented
support and so the way

1745
01:15:20,336 --> 01:15:24,576
in which they added more and
more and more functionality

1746
01:15:24,576 --> 01:15:27,646
to PHP is they arbitrarily use
this underscore convention,

1747
01:15:27,646 --> 01:15:29,546
so all of the MySQL
functions begin

1748
01:15:29,546 --> 01:15:32,266
with MySQL underscore something,

1749
01:15:32,726 --> 01:15:35,556
why they called it MySQL real
escape string is a ridiculous.

1750
01:15:35,606 --> 01:15:37,676
There was a MySQL
escape string function

1751
01:15:37,676 --> 01:15:39,466
which is apparently
a little bit flawed

1752
01:15:39,466 --> 01:15:42,146
so if someone else created a
real escape string function I

1753
01:15:42,326 --> 01:15:44,966
mean it's like a bad joke that's
become popular convention,

1754
01:15:44,966 --> 01:15:47,936
so anyhow, we'll do
better that this next time

1755
01:15:47,936 --> 01:15:50,536
when we introduce another API
for database connectivity,

1756
01:15:50,746 --> 01:15:53,886
but the point for today is that
super important to do this,

1757
01:15:53,886 --> 01:15:56,936
otherwise you put your data at
risk to various attacks among

1758
01:15:56,936 --> 01:15:59,776
which include deletions
or random insertions

1759
01:15:59,896 --> 01:16:03,046
into your tables or theft
outright of like your users

1760
01:16:03,046 --> 01:16:05,066
and passwords or hashes thereof.

1761
01:16:05,516 --> 01:16:07,626
Okay, so all three
of these lines

1762
01:16:07,626 --> 01:16:10,076
for the moment are equivalent
but the third one is the best

1763
01:16:10,646 --> 01:16:14,276
because it has this additional
security check, so let's go back

1764
01:16:14,276 --> 01:16:17,086
to just the original
and delete the other two

1765
01:16:17,086 --> 01:16:19,576
that I've prepared here and
let's see what I now do.

1766
01:16:19,826 --> 01:16:23,546
Here, is how you
execute a command in PHP

1767
01:16:23,546 --> 01:16:26,986
against this MySQL server, so
when I use the GUI a bit ago,

1768
01:16:27,106 --> 01:16:29,886
PHPMyadmin which again is
just an administrative tool

1769
01:16:29,886 --> 01:16:32,406
for poking around
the database, has no,

1770
01:16:32,716 --> 01:16:36,676
it's something you're going
use in writing your Project One

1771
01:16:36,676 --> 01:16:37,926
or projects in general.

1772
01:16:38,216 --> 01:16:40,286
MySQL query is the
equivalent in PHP

1773
01:16:40,356 --> 01:16:42,846
of my having clicked the submit
button a few minutes ago,

1774
01:16:43,186 --> 01:16:44,246
so what does it return?

1775
01:16:44,366 --> 01:16:46,536
So, it returns what's called
a result set and you can think

1776
01:16:46,536 --> 01:16:49,736
of a result set as a
collection of rows.

1777
01:16:49,736 --> 01:16:52,576
It's a temporary table that
is the result of executing

1778
01:16:52,576 --> 01:16:54,246
that query which in
this case is select.

1779
01:16:54,376 --> 01:16:57,176
So, the temporary table
I'm going to back is going

1780
01:16:58,006 --> 01:17:02,326
to have how many columns based
on the orange query up there?

1781
01:17:02,856 --> 01:17:03,186
Axel.

1782
01:17:03,926 --> 01:17:04,636
>> How many columns?

1783
01:17:04,636 --> 01:17:06,396
>> How many columns
will this query?

1784
01:17:06,736 --> 01:17:06,916
Sorry?

1785
01:17:07,756 --> 01:17:08,226
>> Two columns.

1786
01:17:08,226 --> 01:17:09,066
>> Two, and what are they?

1787
01:17:09,536 --> 01:17:10,306
>> Username, password.

1788
01:17:10,366 --> 01:17:13,406
>> Good. So, even though I use
star the table itself recalled

1789
01:17:13,406 --> 01:17:15,156
that we made only
has two columns

1790
01:17:15,366 --> 01:17:18,086
so the temporary table I'm
going to get back by selecting

1791
01:17:18,086 --> 01:17:20,226
from that table also
has two columns.

1792
01:17:20,836 --> 01:17:22,606
Now, how many rows
is it going to have?

1793
01:17:22,636 --> 01:17:24,286
Well, that depends on
what the user typed in.

1794
01:17:24,466 --> 01:17:26,876
Suppose the user did type
in jharvard and crimson,

1795
01:17:26,876 --> 01:17:28,696
how many rows obviously
should the table,

1796
01:17:28,856 --> 01:17:29,766
the temporary table have?

1797
01:17:31,056 --> 01:17:31,616
Just the one.

1798
01:17:31,856 --> 01:17:34,506
Now, in this case it's
coincidence the temporary table

1799
01:17:34,506 --> 01:17:36,246
is identical to the
actual table,

1800
01:17:36,456 --> 01:17:38,486
but now assume we're
implementing Facebook

1801
01:17:38,486 --> 01:17:42,266
and we have 500 million users,
only one of whom is John Harvard

1802
01:17:42,436 --> 01:17:44,986
so then our temporary table is
obviously much, much smaller

1803
01:17:44,986 --> 01:17:46,886
than the actual table,
so then it make sense

1804
01:17:46,886 --> 01:17:48,436
that would be doing
this selection.

1805
01:17:48,756 --> 01:17:51,316
Alright, so if result
equal, equals, equals false

1806
01:17:51,716 --> 01:17:53,306
that just means something
bad went wrong;

1807
01:17:53,306 --> 01:17:54,456
how do I know to check for that?

1808
01:17:54,456 --> 01:17:57,866
The documentation for the MySQL
query function on PHP.net says

1809
01:17:57,866 --> 01:17:59,726
"upon error this
function returns false"

1810
01:17:59,946 --> 01:18:01,696
so that's how I knew
to check for that.

1811
01:18:01,696 --> 01:18:04,976
I'm using the identity operator
again to be super correct.

1812
01:18:05,926 --> 01:18:09,266
Die sort of an unfortunately
named function, but it's exit

1813
01:18:09,266 --> 01:18:11,116
with a non-zero exit
code in this case

1814
01:18:11,116 --> 01:18:12,886
so this just means
something really went wrong.

1815
01:18:13,256 --> 01:18:16,286
You should not typically
die in production code.

1816
01:18:16,286 --> 01:18:17,996
So, if you're implementing
a real website

1817
01:18:17,996 --> 01:18:21,496
for real users selling stuff,
getting popularity and so forth,

1818
01:18:21,846 --> 01:18:24,276
you should not die, you should
instead return some kind

1819
01:18:24,276 --> 01:18:26,696
of error message to the user
but given that we're just trying

1820
01:18:26,696 --> 01:18:29,476
to demonstrate MySQL here and
we don't want to get into views

1821
01:18:29,476 --> 01:18:33,366
and MVC again all of that, it's
reasonable here to just say die

1822
01:18:33,426 --> 01:18:34,706
with and informative
error message.

1823
01:18:34,706 --> 01:18:36,846
The user will see it, but it
won't be very user friendly,

1824
01:18:36,846 --> 01:18:37,726
it's going to be text.

1825
01:18:38,166 --> 01:18:40,666
So, now we have to do
something with this result set.

1826
01:18:40,806 --> 01:18:44,876
So, just like XPath the function
in PHP returns a node set

1827
01:18:45,346 --> 01:18:48,576
which is a collection of
nodes or really an array

1828
01:18:48,576 --> 01:18:52,006
of nodes similarly is a
result set of collection

1829
01:18:52,006 --> 01:18:55,406
or really an array of rows
from a temporary table.

1830
01:18:55,696 --> 01:18:58,366
So, I can count those
rows with MySQL num rows

1831
01:18:58,586 --> 01:19:01,236
and if MySQL num rows when past

1832
01:19:01,466 --> 01:19:03,416
that result set returns
the number one

1833
01:19:03,636 --> 01:19:04,936
that means there's
one row in it.

1834
01:19:05,576 --> 01:19:08,366
So, why, what do I thing do?

1835
01:19:08,626 --> 01:19:10,636
If there is in fact one row

1836
01:19:10,636 --> 01:19:15,266
that I got back I call this
line, MySQL fetch assoc.

1837
01:19:15,676 --> 01:19:16,816
There's a few different versions

1838
01:19:16,816 --> 01:19:18,806
of this you can call
MySQL fetch array,

1839
01:19:18,986 --> 01:19:20,766
you can call MySQL
fetch objects,

1840
01:19:20,766 --> 01:19:22,306
you can call MySQL fetch assoc;

1841
01:19:22,616 --> 01:19:25,646
in this case MySQL fetch
assoc is associative array

1842
01:19:26,016 --> 01:19:27,226
so it's nice because it's going

1843
01:19:27,226 --> 01:19:29,716
to give me back an associative
array of key value pairs

1844
01:19:29,866 --> 01:19:32,316
where the keys are,
take a guess.

1845
01:19:32,766 --> 01:19:36,016
>> Username, password.

1846
01:19:36,016 --> 01:19:36,436
>> Correct.

1847
01:19:36,436 --> 01:19:38,936
The column names more
generally username and password

1848
01:19:38,936 --> 01:19:41,976
in this case and the values are
what's actually in the cells.

1849
01:19:42,456 --> 01:19:42,536
Yeah.

1850
01:19:42,996 --> 01:19:49,036
>> One thing, instead of doing
the MySQL count rows thing.

1851
01:19:49,036 --> 01:19:49,396
>> Ah hah.

1852
01:19:49,396 --> 01:19:51,936
>> Can't you just as
a SQL do limit one?

1853
01:19:52,216 --> 01:19:55,146
>> Yes, I could do that.

1854
01:19:55,616 --> 01:19:59,716
I could add a limit one which is
another piece of SEQUEL syntax

1855
01:20:00,196 --> 01:20:04,416
up here, however, it could still
return zero, so I need to check

1856
01:20:04,416 --> 01:20:06,406
for something and
so that's correct.

1857
01:20:06,786 --> 01:20:08,546
If my database is correct though

1858
01:20:08,546 --> 01:20:10,476
and I add something
called unique index

1859
01:20:10,476 --> 01:20:13,766
which we'll come back to, that
should hopefully be unnecessary

1860
01:20:13,826 --> 01:20:16,166
but it's still good practice,
still a good addition

1861
01:20:16,166 --> 01:20:17,246
so I'll leave it in there.

1862
01:20:17,796 --> 01:20:21,536
Alright, so now how about
below this fetch line.

1863
01:20:21,846 --> 01:20:23,506
So, what do I have
do at this point?

1864
01:20:23,666 --> 01:20:28,546
At this point in the story I
have presumably, at this point

1865
01:20:28,546 --> 01:20:32,126
in the story I have selected
John Harvard from the database,

1866
01:20:32,226 --> 01:20:33,406
but I haven't checked what?

1867
01:20:36,916 --> 01:20:39,216
I only selected based
on username, so I have.

1868
01:20:39,256 --> 01:20:40,566
>> Have to check password.

1869
01:20:40,566 --> 01:20:42,646
>> Right, so I still need to
do this additional step just

1870
01:20:42,646 --> 01:20:45,906
because jharvard exists does not
mean this person trying to log

1871
01:20:45,906 --> 01:20:48,026
in is jharvard until we
actually check the password.

1872
01:20:48,286 --> 01:20:49,016
So, let's not do this.

1873
01:20:49,366 --> 01:20:50,066
It's pretty trivial.

1874
01:20:50,386 --> 01:20:52,186
So, after I've called
MySQL fetch assoc,

1875
01:20:52,436 --> 01:20:53,786
I have an associative array

1876
01:20:53,886 --> 01:20:56,936
for that row whose keys are the
columns whose values are the

1877
01:20:57,016 --> 01:20:58,416
cells in that row.

1878
01:20:58,746 --> 01:21:01,296
So, if row "pass" whoops I got

1879
01:21:01,296 --> 01:21:04,576
to change this cause I chose
different names before.

1880
01:21:05,096 --> 01:21:11,146
So, if row "password" equals
post "password" actually it pass

1881
01:21:11,146 --> 01:21:12,106
in the HTML so I'll leave

1882
01:21:12,106 --> 01:21:14,316
that be even though I'm being
slightly inconsistent here.

1883
01:21:15,336 --> 01:21:16,326
What do I want to do?

1884
01:21:16,516 --> 01:21:19,236
Remember that the user's logged
in and then all of these lines

1885
01:21:19,236 --> 01:21:21,776
down here is just the cryptic
stuff that we used last time

1886
01:21:21,776 --> 01:21:25,146
to do the redirection of the
URLs to redirect the user back

1887
01:21:25,146 --> 01:21:27,436
to home.PHP and that's it.

1888
01:21:27,436 --> 01:21:31,346
If they are not logged in notice
that we have all of this stuff

1889
01:21:31,346 --> 01:21:34,966
down here which is
just a simple form

1890
01:21:35,026 --> 01:21:36,976
that we've seen a
couple lectures ago.

1891
01:21:37,386 --> 01:21:39,656
So, let's see this in action
and see if we got it all right,

1892
01:21:39,746 --> 01:21:44,436
so let me go back to my browser,
and let me reload log in five,

1893
01:21:45,006 --> 01:21:46,956
good we got rid of the
orange message, how?

1894
01:21:47,136 --> 01:21:49,266
Well, we're actually connecting
to a database properly,

1895
01:21:49,736 --> 01:21:51,766
so it doesn't mean my, the
rest of my code is correct

1896
01:21:51,766 --> 01:21:54,406
so let's try this,
jharvard with no password,

1897
01:21:54,536 --> 01:21:56,306
let's see if I can sneak
my way into the site.

1898
01:21:56,686 --> 01:21:59,206
Log in, could not
query database,

1899
01:21:59,256 --> 01:22:00,746
that actually does
not sound right,

1900
01:22:00,816 --> 01:22:02,726
so what could have
gone wrong there?

1901
01:22:03,046 --> 01:22:06,396
Let's go back here
and what am I missing?

1902
01:22:06,396 --> 01:22:11,226
Jharvard crimson want to
connect to jharvard lecture

1903
01:22:11,636 --> 01:22:13,406
and it's saying could
not connect

1904
01:22:13,746 --> 01:22:17,496
to database, why is that?

1905
01:22:18,206 --> 01:22:23,826
Jharvard crimson could not
query database that's different

1906
01:22:23,826 --> 01:22:26,996
message, sorry so
we are connecting,

1907
01:22:26,996 --> 01:22:30,926
sprintf MySQL select
database users where,

1908
01:22:30,976 --> 01:22:33,456
oh okay so this is failing

1909
01:22:34,386 --> 01:22:37,376
and this is just my unfortunate
choice of names earlier.

1910
01:22:37,666 --> 01:22:38,826
What's the field
actually called?

1911
01:22:39,466 --> 01:22:39,716
>> Username.

1912
01:22:40,126 --> 01:22:41,576
>> Good. Alright, so
I made the mistake

1913
01:22:41,576 --> 01:22:43,846
when I prefabbed the code
I called it user instead

1914
01:22:43,846 --> 01:22:46,586
of username so let's see if that
now solves our problem there.

1915
01:22:47,246 --> 01:22:49,506
Reload, we're going to
resubmit the same form

1916
01:22:49,506 --> 01:22:54,156
with just jharvard no password,
hum, I ended up back here, why?

1917
01:22:54,156 --> 01:22:54,226
Yeah.

1918
01:22:54,226 --> 01:23:00,066
>> Because you were not
allowed to log in so you were.

1919
01:23:00,066 --> 01:23:02,326
>> Exactly, these are super
simple examples the fact

1920
01:23:02,526 --> 01:23:05,606
that I've been, I've
seen the form again means

1921
01:23:05,606 --> 01:23:06,456
that I got it wrong.

1922
01:23:06,456 --> 01:23:10,086
So, let's try jharvard and some
random sequence of characters,

1923
01:23:10,116 --> 01:23:15,546
log in, still not working,
jharvard and crimson log in,

1924
01:23:16,246 --> 01:23:18,826
nice and so now I'm logged in.

1925
01:23:18,866 --> 01:23:21,576
I can log out and now I'm
logged out, back at home,

1926
01:23:21,906 --> 01:23:23,556
I can try logging again.

1927
01:23:23,556 --> 01:23:25,016
So, let's prove that
this isn't faked.

1928
01:23:25,456 --> 01:23:27,876
Let's go to where is it?

1929
01:23:27,876 --> 01:23:32,246
Let's go to the appliance
PHPMyadmin and let's go

1930
01:23:32,246 --> 01:23:35,646
to our user's table,
zoom out a bit here,

1931
01:23:36,216 --> 01:23:40,126
let's edit jharvard crimson and
let's see the update command,

1932
01:23:40,416 --> 01:23:44,686
so I'm going to change John
Harvard's password to password

1933
01:23:44,686 --> 01:23:48,366
and then I'm click go,
and notice now the SEQUEL

1934
01:23:48,366 --> 01:23:50,996
that was executed, here's how
you update a field, update,

1935
01:23:51,636 --> 01:23:55,946
users, set, password
equals "password" alright

1936
01:23:55,946 --> 01:23:58,626
so slightly confusing but
this is the column name,

1937
01:23:58,866 --> 01:24:03,386
this is the value, where
user name equals "jharvard"

1938
01:24:03,616 --> 01:24:12,496
and in this case, whoops, and
let's see, my cache expired,

1939
01:24:12,856 --> 01:24:15,696
let me do this again, edit,
we're going to change it 1, 2,

1940
01:24:15,696 --> 01:24:17,236
3, 4, 5 so it's more explicit.

1941
01:24:18,066 --> 01:24:23,626
So, and password equals
password, so it is trying

1942
01:24:23,626 --> 01:24:26,076
to change John Harvard's
password to 1, 2, 3, 4,

1943
01:24:26,076 --> 01:24:28,966
5 for any row for
which it is the case

1944
01:24:29,446 --> 01:24:30,956
that username equals jharvard

1945
01:24:30,956 --> 01:24:32,566
and password equals
1, 2, 3, 4, 5.

1946
01:24:32,826 --> 01:24:35,576
So, that's one way of uniquely
identifying this row hopefully

1947
01:24:35,886 --> 01:24:38,396
unless we screwed up and we
allowed two John Harvards

1948
01:24:38,396 --> 01:24:40,616
who coincidentally have the
same password in the database

1949
01:24:40,866 --> 01:24:42,616
in which case both of
their passwords are going

1950
01:24:42,616 --> 01:24:44,026
to be changed which
just is wrong,

1951
01:24:44,026 --> 01:24:45,216
so we'll fix that shortly.

1952
01:24:46,326 --> 01:24:48,796
Alright, so what
else can we do here?

1953
01:24:48,796 --> 01:24:50,316
How can we improve upon this?

1954
01:24:50,316 --> 01:24:52,986
Well, let me go back into the
code here and let me go ahead

1955
01:24:52,986 --> 01:24:56,186
and open up log in
6.PHP instead,

1956
01:24:56,536 --> 01:25:01,426
so when log in 6.PHP notice that
we have a couple of changes here

1957
01:25:03,896 --> 01:25:09,336
and what different
if we focus on this?

1958
01:25:09,536 --> 01:25:10,746
Much of the code is the same,

1959
01:25:10,746 --> 01:25:12,436
but what have I fundamentally
done that's different?

1960
01:25:12,486 --> 01:25:12,866
Yeah, Axel.

1961
01:25:12,866 --> 01:25:19,166
>> Well, in the SQL you
queried for the username

1962
01:25:19,166 --> 01:25:22,536
and the password
at the same time.

1963
01:25:22,636 --> 01:25:24,126
>> Good, so in MySQL query

1964
01:25:24,126 --> 01:25:26,486
in this middle thing here
select one from users

1965
01:25:26,486 --> 01:25:29,866
where user equals %s
and pass equals %s,

1966
01:25:30,046 --> 01:25:32,796
I need to fix this again and
change user to username and pass

1967
01:25:32,846 --> 01:25:36,426
to password, but that's not
a big deal so that fixes

1968
01:25:36,426 --> 01:25:38,096
that same issue again.

1969
01:25:38,486 --> 01:25:41,496
So, selecting one why
am I selecting one?

1970
01:25:42,026 --> 01:25:43,276
Anyone want to hazard a guess

1971
01:25:43,276 --> 01:25:45,956
at why I could take this
alternative approach,

1972
01:25:45,956 --> 01:25:47,476
what's an upside,
what's a downside?

1973
01:25:48,056 --> 01:25:55,806
If any? What do you
like about it?

1974
01:25:56,516 --> 01:26:03,636
[ Pause ]

1975
01:26:04,136 --> 01:26:10,156
Nothing? Okay, what do
you dislike about it?

1976
01:26:10,386 --> 01:26:12,126
Alright, no one cares about it.

1977
01:26:13,026 --> 01:26:16,486
So, take a guess like think
back to the previous example

1978
01:26:16,486 --> 01:26:19,566
where I only selected based on
jharvard and I did some stuff.

1979
01:26:20,186 --> 01:26:22,826
This time I'm selecting
based on jharvard and crimson

1980
01:26:23,396 --> 01:26:25,816
and I'm doing less stuff, right?

1981
01:26:25,966 --> 01:26:28,926
What do I not have to do
in the case of this query?

1982
01:26:29,516 --> 01:26:32,816
[ Inaudible Response ]

1983
01:26:33,316 --> 01:26:35,216
Yeah, I'm just saving
myself some code, right?

1984
01:26:35,216 --> 01:26:38,606
And this speaks to the
beginnings of the power

1985
01:26:38,606 --> 01:26:40,046
of an actual database server.

1986
01:26:40,226 --> 01:26:42,366
You have the ability
do filtration

1987
01:26:42,366 --> 01:26:43,986
and selection and conditionals.

1988
01:26:44,256 --> 01:26:47,706
You don't need to reimplement
that wheel in the code,

1989
01:26:47,706 --> 01:26:50,136
in PHP code or any
language for that matter,

1990
01:26:50,396 --> 01:26:52,346
so if you have the
ability to do Booleans ANDs

1991
01:26:52,346 --> 01:26:54,566
and you do here apparent
in SEQUEL with capital

1992
01:26:54,566 --> 01:26:57,296
AND in this case, well
just punt to the database,

1993
01:26:57,296 --> 01:26:59,496
let the database figure
out if John Harvard exists

1994
01:26:59,496 --> 01:27:01,506
and if this is indeed his
password cause it means I can

1995
01:27:01,506 --> 01:27:04,626
write less code and
frankly, I'm going to assume

1996
01:27:04,626 --> 01:27:08,306
that some professional database
people in MySQL are smarter

1997
01:27:08,306 --> 01:27:10,706
than I am when it comes to
optimizing certain types

1998
01:27:10,706 --> 01:27:12,666
of queries like comparisons
like this.

1999
01:27:13,056 --> 01:27:14,956
Now, granted I still have
to do the thought upfront

2000
01:27:14,956 --> 01:27:17,776
as to how long the field should
be, what should the datatypes be

2001
01:27:17,776 --> 01:27:20,396
to kind of help the database
be highly performing,

2002
01:27:20,786 --> 01:27:23,036
but once I do those
initial hints

2003
01:27:23,036 --> 01:27:26,196
when configuring the schema
frankly the database should be

2004
01:27:26,196 --> 01:27:29,206
able to do this pretty fast
if It's doing its job well,

2005
01:27:29,466 --> 01:27:32,056
so in the case I'm
still escaping both just

2006
01:27:32,056 --> 01:27:34,696
so that there's no danger
and I'm only selecting one

2007
01:27:34,696 --> 01:27:36,386
as a slight modification here.

2008
01:27:36,686 --> 01:27:38,356
Typically, what did
we say is returned

2009
01:27:38,356 --> 01:27:39,636
when you do a select statement?

2010
01:27:40,496 --> 01:27:42,966
You get back a temporary
table inside of which is what?

2011
01:27:42,966 --> 01:27:47,156
>> The things you asked for.

2012
01:27:47,396 --> 01:27:48,336
>> The things I asked for.

2013
01:27:48,526 --> 01:27:50,716
So, technically I'm not
really asking for much now.

2014
01:27:51,016 --> 01:27:53,806
All I want is the number
one back from the database,

2015
01:27:53,806 --> 01:27:56,596
so this is a minor
performance improvement

2016
01:27:56,656 --> 01:27:59,096
for at least a small database
here, but if you think

2017
01:27:59,096 --> 01:28:02,626
about in general, if you
were returning a whole,

2018
01:28:02,626 --> 01:28:05,306
if you were selecting
star and we had more rows,

2019
01:28:05,306 --> 01:28:07,446
more columns rather than
just username and password

2020
01:28:07,446 --> 01:28:10,806
and we had phone number and
email address and, you know,

2021
01:28:10,806 --> 01:28:13,596
GChat ID and all
these various pieces

2022
01:28:13,596 --> 01:28:15,366
of data you might have
associated with the user,

2023
01:28:15,646 --> 01:28:17,906
why in the hell do I need to
select all of those pieces

2024
01:28:17,906 --> 01:28:20,286
of data when all I want is
an answer to the question,

2025
01:28:20,526 --> 01:28:21,746
is your password correct?

2026
01:28:22,266 --> 01:28:25,336
So, the way to return that
answer as a efficiently

2027
01:28:25,336 --> 01:28:28,556
as possible is just give
me back a temporary table

2028
01:28:28,556 --> 01:28:31,186
with one column and
one row inside

2029
01:28:31,186 --> 01:28:35,966
of which is literally the number
one if my query is, so to speak,

2030
01:28:36,646 --> 01:28:38,466
correct otherwise
I get back no rows

2031
01:28:38,466 --> 01:28:41,046
in which case it's clearly
not his username and password.

2032
01:28:41,276 --> 01:28:42,946
So, a minor performance
improvement

2033
01:28:42,946 --> 01:28:45,896
for again small datasets,
but the fundamental idea is

2034
01:28:45,896 --> 01:28:49,176
that don't select more data than
you actually need, and indeed,

2035
01:28:49,176 --> 01:28:52,266
previously when I did select
star, I was being kind of lazy.

2036
01:28:52,456 --> 01:28:54,766
I needed to get back
John Harvard's password,

2037
01:28:54,926 --> 01:28:56,906
but did I need to get back
John Harvard's username?

2038
01:28:56,906 --> 01:29:00,916
I mean, no I already had it,
right, I gave it to the database

2039
01:29:00,916 --> 01:29:03,526
so that too was just lazy
on my part, so in general,

2040
01:29:03,526 --> 01:29:05,626
avoid using star cause
you're just going

2041
01:29:05,626 --> 01:29:08,566
to waste time transferring
information from the database

2042
01:29:08,846 --> 01:29:11,656
to the web server, to
your PHP code to access

2043
01:29:11,656 --> 01:29:12,566
and if you don't care

2044
01:29:12,566 --> 01:29:14,976
about those fields you should
instead enumerate the ones you

2045
01:29:14,976 --> 01:29:18,336
do care about one by one
unless you want them all back.

2046
01:29:19,636 --> 01:29:22,466
Alright, so the rest of
this code is then thereafter

2047
01:29:22,466 --> 01:29:22,986
the same.

2048
01:29:22,986 --> 01:29:26,176
So, let's see another
variance of this that allows us

2049
01:29:26,176 --> 01:29:28,266
to take it in a slightly
different direction.

2050
01:29:28,676 --> 01:29:31,846
So, in this case, we're
solving the problem

2051
01:29:31,846 --> 01:29:33,296
that Axel referred to earlier.

2052
01:29:33,726 --> 01:29:37,466
So, this design we started
with it's kind of stupid.

2053
01:29:37,466 --> 01:29:42,266
It's not incorrect, it's just
not very good for security

2054
01:29:42,266 --> 01:29:44,536
and by that I mean my
usernames are in clear text

2055
01:29:44,536 --> 01:29:46,636
and that's okay that
kind of is necessary,

2056
01:29:47,016 --> 01:29:48,766
but my passwords are
also in clear text

2057
01:29:48,766 --> 01:29:51,296
and by clear text I mean if John
Harvard's password is crimson,

2058
01:29:51,596 --> 01:29:52,636
what do we see in the database?

2059
01:29:52,996 --> 01:29:55,556
Crimson, what does that mean?

2060
01:29:55,706 --> 01:29:56,996
It means if someone's looking

2061
01:29:56,996 --> 01:29:58,916
over the database
administrator's shoulder he

2062
01:29:58,916 --> 01:30:00,536
or she sees that same password,

2063
01:30:00,746 --> 01:30:03,806
it means if the database
is somehow compromised

2064
01:30:03,806 --> 01:30:06,966
and a bad guy physically walks
up to the computer and copies it

2065
01:30:07,186 --> 01:30:09,766
or somehow someone on the
Internet steals that database

2066
01:30:09,916 --> 01:30:12,726
or someone on the Internet
executes a SEQUEL injection

2067
01:30:12,726 --> 01:30:15,126
attack inside of which
is a select statement

2068
01:30:15,126 --> 01:30:17,406
so now some random kid on
the Internet has selected all

2069
01:30:17,406 --> 01:30:19,726
of the users and passwords
on my database, in short,

2070
01:30:20,106 --> 01:30:23,766
not so good if by losing the
database table you're also

2071
01:30:23,766 --> 01:30:25,806
losing your user's passwords.

2072
01:30:26,546 --> 01:30:27,716
So, what's an alternative?

2073
01:30:28,056 --> 01:30:29,536
So, you proposed
what before Axel?

2074
01:30:30,046 --> 01:30:33,356
>> You can one way
encrypt the password.

2075
01:30:33,656 --> 01:30:35,846
>> Okay, what does it mean to
one way encrypt the passwords?

2076
01:30:35,906 --> 01:30:38,766
>> Well, it means to scramble
the letters and numbers in a way

2077
01:30:38,976 --> 01:30:41,766
that they can't engineer
it backwards.

2078
01:30:41,836 --> 01:30:44,906
>> Good. So, a one way hash as
it's generally called is kind

2079
01:30:44,906 --> 01:30:47,646
of like encryption but it's
encryption in one direction

2080
01:30:47,986 --> 01:30:50,936
and it's encryption in the sense
that it does scramble the input,

2081
01:30:50,936 --> 01:30:53,966
so your clear text becomes
some kind of cypher text,

2082
01:30:54,216 --> 01:30:55,836
but it's one way in the sense

2083
01:30:55,836 --> 01:30:57,326
that you can't undo
those effects,

2084
01:30:57,896 --> 01:31:02,526
so if my password is crimson
for crimson what's stored

2085
01:31:02,526 --> 01:31:05,916
in the database is some crazy
looking sequence of characters,

2086
01:31:05,976 --> 01:31:07,466
it's not going to be crimson.

2087
01:31:07,966 --> 01:31:09,436
The only problem with that is

2088
01:31:09,436 --> 01:31:12,476
that no one now can see
what my password is.

2089
01:31:12,806 --> 01:31:15,996
So it feels like this is not
the right solution, right?

2090
01:31:15,996 --> 01:31:17,436
If I am not storing crimson

2091
01:31:17,436 --> 01:31:20,786
in the database I'm instead
storing a hash of it,

2092
01:31:20,966 --> 01:31:25,996
a mutation a scrambling of it
somehow and yet me the human,

2093
01:31:26,106 --> 01:31:28,766
knows I know what
my password is.

2094
01:31:28,766 --> 01:31:30,806
I don't know what that
random series of text is,

2095
01:31:30,916 --> 01:31:32,216
only the database does.

2096
01:31:32,646 --> 01:31:34,036
How do I subsequently log

2097
01:31:34,036 --> 01:31:36,196
in after my password
has been stored

2098
01:31:36,256 --> 01:31:38,006
in this scrambled fashion?

2099
01:31:38,686 --> 01:31:38,776
Yeah.

2100
01:31:41,496 --> 01:31:44,786
>> Encrypt it and [inaudible].

2101
01:31:45,046 --> 01:31:46,806
>> Yeah, exactly so
there is a solution here.

2102
01:31:46,806 --> 01:31:49,856
So, when I create my account,
when I register for an account,

2103
01:31:50,096 --> 01:31:53,156
they don't store the
database crimson any more,

2104
01:31:53,156 --> 01:31:55,116
they instead run it
through this one way hash

2105
01:31:55,116 --> 01:31:58,476
which is just a function that
returns some random characters,

2106
01:31:58,706 --> 01:32:01,736
then they store that
string in the database.

2107
01:32:02,146 --> 01:32:04,166
So, what does that mean
for subsequent log in's?

2108
01:32:04,166 --> 01:32:06,586
Well, I obviously don't know
what that random string is

2109
01:32:06,586 --> 01:32:08,716
or that seemingly random
string is, but I do know

2110
01:32:08,716 --> 01:32:10,716
that my password's crimson,
so all we have to do is

2111
01:32:10,716 --> 01:32:12,796
that same math, those
same mathematics again.

2112
01:32:12,796 --> 01:32:16,316
The next time you try to log
in, take my password crimson

2113
01:32:16,316 --> 01:32:20,476
which I typed into the post
submission, encrypt that or hash

2114
01:32:20,476 --> 01:32:22,386
that so to speak and
then compare what?

2115
01:32:22,926 --> 01:32:25,616
Compare that result against
the result you stored

2116
01:32:25,616 --> 01:32:28,696
in the database and if they
match then it must be me.

2117
01:32:29,086 --> 01:32:34,406
Now, small white lie it's
possible with various hashes

2118
01:32:34,696 --> 01:32:38,296
that two people's passwords
could hash to the same value,

2119
01:32:38,616 --> 01:32:41,626
so if my password is crimson
and your password is 1, 2, 3, 4,

2120
01:32:41,626 --> 01:32:44,956
5 in theory because of the
way hashing functions work,

2121
01:32:45,186 --> 01:32:47,266
the random sequence of
characters that's stored

2122
01:32:47,266 --> 01:32:49,656
in the database could
actually be identical,

2123
01:32:50,106 --> 01:32:50,996
so what does this mean?

2124
01:32:51,156 --> 01:32:57,086
This actually means that I
could log in with either crimson

2125
01:32:57,086 --> 01:32:59,826
or with 1, 2, 3, 4,
5 the only catch is

2126
01:32:59,826 --> 01:33:03,376
that I have no idea
mathematically what else hash is

2127
01:33:03,376 --> 01:33:06,166
to that same value so the
reality is almost all of us

2128
01:33:06,166 --> 01:33:07,226
who have accounts on sites

2129
01:33:07,226 --> 01:33:09,846
on the Internet odds are you
don't just have one password,

2130
01:33:09,846 --> 01:33:12,316
you could have maybe
two, maybe even three

2131
01:33:12,316 --> 01:33:15,016
or more passwords it's just
you have no idea what they are

2132
01:33:15,126 --> 01:33:16,566
and you could try
to figure it out,

2133
01:33:16,566 --> 01:33:19,566
but if passwords can be 12
characters, 16 characters

2134
01:33:19,796 --> 01:33:21,946
and they can be letters and
numbers and punctuation,

2135
01:33:21,946 --> 01:33:24,826
that's going to take a lot
of time to brute force figure

2136
01:33:24,826 --> 01:33:27,446
out what your other
possible passwords are,

2137
01:33:27,626 --> 01:33:29,526
but this is just the
nature of hashing function.

2138
01:33:29,786 --> 01:33:31,136
So, how are we doing it here?

2139
01:33:31,136 --> 01:33:34,506
In orange here in the middle,
almost the same code but notice

2140
01:33:34,506 --> 01:33:36,456
that I'm practicing
what I'm preaching here.

2141
01:33:36,456 --> 01:33:38,396
I'm comparing not
just the username

2142
01:33:38,396 --> 01:33:42,266
with %s ignoring the
misnamed user and pass field,

2143
01:33:42,696 --> 01:33:44,376
not just comparing
the user fields,

2144
01:33:44,716 --> 01:33:46,466
I'm also comparing
the pass field

2145
01:33:46,466 --> 01:33:48,726
but I'm not comparing the
pass field against %s,

2146
01:33:48,726 --> 01:33:52,346
what am I comparing
the pass field against?

2147
01:33:52,346 --> 01:33:59,396
Capital letters, PASSWORD ("%s")
so it turns out that MySQL

2148
01:33:59,606 --> 01:34:01,586
as a database server can do more

2149
01:34:01,586 --> 01:34:03,226
than just store rows
and columns.

2150
01:34:03,436 --> 01:34:05,966
It can also allow you to
call functions, and in fact,

2151
01:34:05,966 --> 01:34:09,026
databases often support what are
called stored procedures whereby

2152
01:34:09,106 --> 01:34:11,106
you can write your
own custom functions,

2153
01:34:11,366 --> 01:34:14,166
store them in the database and
then call them, just so happens

2154
01:34:14,166 --> 01:34:16,386
that MySQL gives you
one such function

2155
01:34:16,436 --> 01:34:18,226
for free called password

2156
01:34:18,416 --> 01:34:20,416
which does exactly
this one way hashing

2157
01:34:20,476 --> 01:34:21,566
that we've been talking about.

2158
01:34:21,766 --> 01:34:24,796
There are other functions
like average and sum and count

2159
01:34:24,796 --> 01:34:27,526
and various like useful
things for a toolbox to have,

2160
01:34:27,876 --> 01:34:29,916
but this is one relevant
to passwords,

2161
01:34:30,206 --> 01:34:31,176
now, there is a catch.

2162
01:34:31,506 --> 01:34:36,306
This exists, but does anyone
know what the caveat here is?

2163
01:34:36,906 --> 01:34:36,996
Yeah.

2164
01:34:39,556 --> 01:34:40,656
>> Can reverse engineering.

2165
01:34:40,836 --> 01:34:42,646
>> Can reverse engineering.

2166
01:34:42,716 --> 01:34:44,816
It sucks is really
what it boils down to.

2167
01:34:44,816 --> 01:34:47,506
The password function
is the right idea,

2168
01:34:47,506 --> 01:34:49,496
but not very well
implemented, and in fact,

2169
01:34:49,496 --> 01:34:52,706
with relatively little effort
can you brute force the password

2170
01:34:52,706 --> 01:34:55,776
function and figure out
alternatives to those hashes

2171
01:34:55,776 --> 01:34:57,446
so that you can figure
out effectively how

2172
01:34:57,446 --> 01:34:59,086
to log in as someone else.

2173
01:34:59,426 --> 01:35:01,626
So, there's alternatives and
Axel mentioned MD5 before,

2174
01:35:01,896 --> 01:35:08,156
there's SHA-1, there's MD5,
there's SHA-256, SHA-512 and

2175
01:35:08,156 --> 01:35:13,126
yet others, the reality is that
MySQL does not have great built

2176
01:35:13,126 --> 01:35:15,696
in support for this
kind of hashing.

2177
01:35:15,696 --> 01:35:17,936
There's faults in most
everyone of the algorithms

2178
01:35:17,936 --> 01:35:21,026
that it does support out of
the box, so generally even I

2179
01:35:21,066 --> 01:35:24,176
in recent years have taken to
using third party libraries

2180
01:35:24,176 --> 01:35:29,316
and doing this in code,
PHP pass is an alternative,

2181
01:35:29,316 --> 01:35:31,546
it's a freely open
source PHP library

2182
01:35:31,546 --> 01:35:33,436
that actually does the
mathematics for you

2183
01:35:33,706 --> 01:35:36,216
so you call an actual
PHP function and you pass

2184
01:35:36,246 --> 01:35:37,986
that string into
the database rather

2185
01:35:37,986 --> 01:35:39,786
than letting the
database do it yourself,

2186
01:35:40,046 --> 01:35:42,186
arguably this is
better too because then

2187
01:35:42,186 --> 01:35:45,236
if you every do change
from MySQL to Oracle

2188
01:35:45,236 --> 01:35:47,586
or PostgreSQL you're not relying

2189
01:35:47,586 --> 01:35:50,826
on database specific
functionality you're only

2190
01:35:50,826 --> 01:35:53,806
calling the most basic of SEQUEL
functions like average and count

2191
01:35:53,806 --> 01:35:55,326
and summation and so forth,

2192
01:35:55,566 --> 01:35:57,876
so your code is arguably
a little more portable,

2193
01:35:57,876 --> 01:36:01,796
but this is one of those
minor headaches of MySQL is

2194
01:36:01,796 --> 01:36:04,196
that this function
perfectly named though it is,

2195
01:36:04,196 --> 01:36:06,746
is horribly implemented
so just don't use it,

2196
01:36:07,156 --> 01:36:08,806
but also don't store
things in clear text,

2197
01:36:08,806 --> 01:36:10,776
your better bet is
using a library in PHP

2198
01:36:10,776 --> 01:36:12,606
or whatever language
you're writing in

2199
01:36:13,146 --> 01:36:14,596
but the point ultimately
is to hash it.

2200
01:36:15,066 --> 01:36:18,466
Alright, any questions?

2201
01:36:19,096 --> 01:36:22,646
Alright, so one last variance
of this that was seven

2202
01:36:23,206 --> 01:36:25,886
and here is now eight.

2203
01:36:26,376 --> 01:36:34,156
Alright, so in version
eight, no I'm not going

2204
01:36:34,156 --> 01:36:36,806
to show you this one cause
this one's just similarly not

2205
01:36:36,806 --> 01:36:37,276
so great.

2206
01:36:37,546 --> 01:36:41,926
Let's do this instead.

2207
01:36:42,266 --> 01:36:46,446
Alright. So, MySQL,

2208
01:36:46,596 --> 01:36:48,606
unfortunately documentation
not so good.

2209
01:36:48,696 --> 01:36:53,386
It's correct and it's
complete, but it's not nearly

2210
01:36:53,836 --> 01:36:56,756
as user friendly to navigate
to be honest, so when it comes

2211
01:36:56,756 --> 01:36:59,246
to like looking up things
related to MySQL or SEQUEL

2212
01:36:59,246 --> 01:37:00,996
to be honest, I think
you find that Google

2213
01:37:00,996 --> 01:37:03,096
and various free write websites
some of which we've linked

2214
01:37:03,096 --> 01:37:05,016
on the resources page on
the course's homepage,

2215
01:37:05,236 --> 01:37:07,636
and will point you at
as needed in the pdf

2216
01:37:07,736 --> 01:37:10,656
of Project One are
probably better resources

2217
01:37:10,656 --> 01:37:12,556
but if you ever need
an authoritative answer

2218
01:37:12,806 --> 01:37:16,166
about MySQL realize that the
manual this is the place to go

2219
01:37:16,376 --> 01:37:18,406
and this is in fact
useful for things

2220
01:37:18,406 --> 01:37:21,586
like how big is an int field?

2221
01:37:21,586 --> 01:37:23,026
How big is a big int field?

2222
01:37:23,026 --> 01:37:24,226
How big is a date field?

2223
01:37:24,226 --> 01:37:25,846
How big is a text
field and so forth?

2224
01:37:26,146 --> 01:37:28,386
When you want definitive answers
to those kinds of things,

2225
01:37:28,386 --> 01:37:30,226
best not to trust random
people on the Internet,

2226
01:37:30,456 --> 01:37:32,736
but to actually go to the
official documentation much

2227
01:37:32,736 --> 01:37:36,056
like you would for various PHP
functions but we've glossed

2228
01:37:36,056 --> 01:37:39,586
over one important and
compelling detail earlier in one

2229
01:37:39,586 --> 01:37:42,046
of those dropdown
menus in PHPMyadmin,

2230
01:37:42,596 --> 01:37:45,856
name the primary key index
unique and full text.

2231
01:37:46,876 --> 01:37:52,406
So, in our user's table, we
have username and password,

2232
01:37:54,006 --> 01:37:56,796
which of those should
be by nature unique?

2233
01:37:58,746 --> 01:38:03,076
And by unique I mean only one
user should ever have that value

2234
01:38:03,626 --> 01:38:05,146
for username or for password?

2235
01:38:05,186 --> 01:38:05,286
Yeah.

2236
01:38:05,876 --> 01:38:07,636
>> It should be the username.

2237
01:38:08,006 --> 01:38:08,826
>> Okay, username why?

2238
01:38:09,476 --> 01:38:13,856
>> Because people, oh yes
if you only want one person

2239
01:38:13,856 --> 01:38:15,806
with one particular username.

2240
01:38:15,806 --> 01:38:16,166
>> Perfect.

2241
01:38:16,376 --> 01:38:18,836
>> And the passwords, well you
can't really tell people the

2242
01:38:18,896 --> 01:38:21,066
passwords they should have
and then lots of people have

2243
01:38:21,066 --> 01:38:23,076
like [inaudible] 1, 2, 3, 4.

2244
01:38:23,076 --> 01:38:23,666
>> Okay, good.

2245
01:38:23,666 --> 01:38:26,106
So, username it's
reasonable to expect

2246
01:38:26,106 --> 01:38:28,576
that really should be unique
otherwise consider the

2247
01:38:28,576 --> 01:38:29,426
alternative, right?

2248
01:38:29,426 --> 01:38:32,586
Consider on Facebook if
you could log in as jhavard

2249
01:38:32,786 --> 01:38:33,776
but so could someone else,

2250
01:38:33,856 --> 01:38:35,566
like whose profile
are you going to see?

2251
01:38:35,566 --> 01:38:37,716
So, like that clearly
needs to be unique.

2252
01:38:37,986 --> 01:38:40,706
Now, password doesn't
need to be unique, right?

2253
01:38:40,706 --> 01:38:42,836
Because it'd be nice if we
could have the same password,

2254
01:38:42,836 --> 01:38:45,276
it'd be nice if it's not
something silly like 1, 2, 3, 4,

2255
01:38:45,276 --> 01:38:47,866
5, but by chance you have
500 million plus users

2256
01:38:47,866 --> 01:38:50,256
and two people are going to have
the same password, so to impose

2257
01:38:50,256 --> 01:38:52,356
that constraint seems
a little foolish,

2258
01:38:52,356 --> 01:38:53,756
plus if you really
think about it,

2259
01:38:54,066 --> 01:38:56,546
well if you only had two users
in the system and you try

2260
01:38:56,546 --> 01:38:57,636
to choose a password
and you're told

2261
01:38:57,636 --> 01:38:59,526
that someone already has it, you
know the other guy's password,

2262
01:38:59,626 --> 01:39:00,706
right, but that's a corner case.

2263
01:39:01,056 --> 01:39:05,206
Alright, so in terms of
enforcing this, what can you do?

2264
01:39:05,376 --> 01:39:08,376
Well, if you're writing
code that registers users,

2265
01:39:08,376 --> 01:39:11,546
signs them up for your website
you could just do it in PHP,

2266
01:39:11,616 --> 01:39:13,816
right, you select; if a
user wants to register

2267
01:39:13,816 --> 01:39:15,836
as John Harvard with
username jharvard,

2268
01:39:16,056 --> 01:39:18,496
you first check does jharvard
exist and how do you do that?

2269
01:39:18,496 --> 01:39:20,876
You do a SEQUEL select and
then if it doesn't exist,

2270
01:39:20,876 --> 01:39:22,986
you get back zero rows
then you do your insert,

2271
01:39:23,446 --> 01:39:25,886
if instead you do see a
jharvard already you just yell

2272
01:39:25,886 --> 01:39:27,386
at the user and you say
pick another user name.

2273
01:39:27,946 --> 01:39:30,306
Now, there is a problem
here that we'll come back

2274
01:39:30,306 --> 01:39:34,446
to next week whereby what if two
John Harvard's are sitting next

2275
01:39:34,446 --> 01:39:37,006
to each other or sitting
in Starbucks across,

2276
01:39:37,006 --> 01:39:38,446
halfway across the
world from each other,

2277
01:39:38,806 --> 01:39:41,836
but at roughly the same time
they both try to register

2278
01:39:41,836 --> 01:39:45,786
for the site, so they type in
jharvard and then both hit enter

2279
01:39:45,896 --> 01:39:49,456
at roughly the same time
their post requests go

2280
01:39:49,456 --> 01:39:52,526
to the webserver, the webserver
processes them maybe even

2281
01:39:52,526 --> 01:39:54,786
in parallel alright this
is a multicore computer,

2282
01:39:54,786 --> 01:39:57,486
multiple CPUs it has multiple
threads therefore literally

2283
01:39:57,486 --> 01:40:00,306
stuff can happen at that
the same time and even

2284
01:40:00,306 --> 01:40:02,176
if it can suppose one
of them gets in ever

2285
01:40:02,176 --> 01:40:04,336
so slightly before the
other what might happen?

2286
01:40:04,566 --> 01:40:08,216
Well, both of these guy's
posts might be handled

2287
01:40:08,216 --> 01:40:09,106
by the webserver.

2288
01:40:09,296 --> 01:40:12,496
The webserver is going to treat
them similarly and it's going

2289
01:40:12,496 --> 01:40:15,046
to do a select from the database
to see if jharvard exists,

2290
01:40:15,566 --> 01:40:19,506
but then think about how
computers typically work.

2291
01:40:19,686 --> 01:40:22,126
You don't get 100% of
the CPU's attention

2292
01:40:22,326 --> 01:40:24,876
in multitasks among all
of the various threads

2293
01:40:24,876 --> 01:40:26,576
or things going inside
of the computer,

2294
01:40:26,796 --> 01:40:28,496
so right after your
select statement

2295
01:40:28,496 --> 01:40:31,666
for jharvard this thread
so to speak might be put

2296
01:40:31,666 --> 01:40:33,336
to sleep briefly, split second,

2297
01:40:33,686 --> 01:40:35,586
but that means this
guy might be woken up

2298
01:40:35,706 --> 01:40:37,726
and his select statement
gets checked.

2299
01:40:38,176 --> 01:40:43,076
The answer now to both of these
guys is jharvard does not exist.

2300
01:40:43,796 --> 01:40:45,726
So, now your second
line of code executes

2301
01:40:45,726 --> 01:40:48,966
and you do the inserts then
this guy gets put to sleep just

2302
01:40:48,966 --> 01:40:50,956
because of the operating
system is behaving.

2303
01:40:51,156 --> 01:40:53,346
This guy then tries to do
the insert, what happens?

2304
01:40:54,626 --> 01:40:56,386
It breaks somehow, right?

2305
01:40:56,386 --> 01:40:58,286
Either you're trying to
insert another jharvard

2306
01:40:58,286 --> 01:41:00,346
and worse case you now
have two jharvards which is

2307
01:41:00,346 --> 01:41:01,296
like the Facebook problem,

2308
01:41:01,296 --> 01:41:03,436
now which is the right
jharvard whose profile to show

2309
01:41:03,436 --> 01:41:05,796
and so forth, or
there's a failure case

2310
01:41:05,846 --> 01:41:09,006
or somehow this guy's password
is now the same for both people,

2311
01:41:09,196 --> 01:41:11,316
in short, this is
not a good situation.

2312
01:41:11,526 --> 01:41:13,426
So, we'll solve that problem
next time with this notion

2313
01:41:13,426 --> 01:41:17,146
of locks or transactions,
but for now this is kind

2314
01:41:17,146 --> 01:41:20,026
of a bad situation but this
where two databases are

2315
01:41:20,026 --> 01:41:22,706
so much more powerful than
XML files and CSV files

2316
01:41:22,916 --> 01:41:24,506
where the [inaudible]
is otherwise on you,

2317
01:41:24,796 --> 01:41:28,546
here we can tell the database
make the username field unique

2318
01:41:28,546 --> 01:41:31,696
so that even if I screw
up, I being the developer,

2319
01:41:31,936 --> 01:41:34,586
I can at least still have a
defense in place, so I'm going

2320
01:41:34,586 --> 01:41:38,106
to go users, I'm going to
click not edit cause edit,

2321
01:41:38,106 --> 01:41:40,746
it would just edit my table
instead I'm going to click

2322
01:41:40,746 --> 01:41:44,106
on structure at the top and
now here's just a reminder

2323
01:41:44,106 --> 01:41:47,036
of what we did earlier, so
I've got my username field,

2324
01:41:47,036 --> 01:41:48,986
VARCHAR, Swedish and so forth.

2325
01:41:48,986 --> 01:41:53,276
I'm going to go ahead and
check all and then click change

2326
01:41:53,596 --> 01:41:55,526
and now I'm just going
to see again that form

2327
01:41:55,526 --> 01:41:57,176
that we saw a while ago.

2328
01:41:57,476 --> 01:41:59,456
So, what do I want to change
here, I'm going to scroll

2329
01:41:59,456 --> 01:42:01,066
over to the side and I'm going

2330
01:42:01,066 --> 01:42:06,126
to change the notice here
I can longer do it here,

2331
01:42:06,176 --> 01:42:07,706
I actually forget,
can't do this here.

2332
01:42:08,066 --> 01:42:11,486
So, no longer do I have that
dropdown that had uniqueness

2333
01:42:11,486 --> 01:42:14,126
and that's because the table
already exists, so it's actually

2334
01:42:14,126 --> 01:42:16,716
on another screen and this
is just a PHPMyadmin thing,

2335
01:42:16,716 --> 01:42:19,796
it's not compelling for any
other reason and if I scroll

2336
01:42:19,796 --> 01:42:25,106
down here notice that next to
user name I have a few options.

2337
01:42:25,646 --> 01:42:30,956
So, I have an add unique index,
add index, add spatial index,

2338
01:42:31,046 --> 01:42:34,696
add full text index and actually
there's another way I can

2339
01:42:34,696 --> 01:42:37,446
actually do this,
if I instead go

2340
01:42:37,446 --> 01:42:39,766
over here I can check username

2341
01:42:40,126 --> 01:42:42,236
and then notice these
icons here,

2342
01:42:42,496 --> 01:42:44,186
I can check primary or unique.

2343
01:42:44,376 --> 01:42:46,586
Frankly this UI is
a bit of a mess,

2344
01:42:46,586 --> 01:42:48,066
but in short I have a number

2345
01:42:48,066 --> 01:42:51,706
of alterations I can
now make on this field.

2346
01:42:52,316 --> 01:42:54,026
So, what do I want to say?

2347
01:42:54,026 --> 01:42:56,556
I do want to say
username should be unique,

2348
01:42:56,996 --> 01:43:01,276
but there's a special word given
for unique keys if those keys

2349
01:43:01,536 --> 01:43:04,066
or rather those fields
also are supposed

2350
01:43:04,066 --> 01:43:06,756
to uniquely identify
rows in your table.

2351
01:43:07,446 --> 01:43:10,786
So, in this case username
uniquely identifies my users

2352
01:43:10,786 --> 01:43:14,716
or should at least, so I could
make a unique field by clicking

2353
01:43:14,716 --> 01:43:16,876
where the cursor is now at
top right, clicking unique

2354
01:43:17,236 --> 01:43:18,806
but technically in the
future if I'm going

2355
01:43:18,806 --> 01:43:23,406
to use this field jharvard
to uniquely identify users,

2356
01:43:23,656 --> 01:43:27,516
this is by definition
what's called a primary key.

2357
01:43:27,516 --> 01:43:29,966
It is the key, the field,
the column that you use

2358
01:43:29,966 --> 01:43:32,876
to uniquely identify your
users which it to say,

2359
01:43:32,916 --> 01:43:35,036
your table can have
other unique fields even

2360
01:43:35,036 --> 01:43:37,536
if you don't actually use
them to identify your users,

2361
01:43:37,726 --> 01:43:41,346
for instance, what's another
field that I in theory a user,

2362
01:43:41,576 --> 01:43:43,486
only one user in the
world should have one of?

2363
01:43:43,486 --> 01:43:43,736
Phone number?

2364
01:43:44,516 --> 01:43:51,586
[ Pause ]

2365
01:43:52,086 --> 01:43:53,876
Cell phone maybe, but
you know some of us,

2366
01:43:53,956 --> 01:43:56,276
some of you still
have landlines, right?

2367
01:43:56,346 --> 01:43:58,436
And whole families or roommates.

2368
01:43:58,436 --> 01:43:58,786
What?

2369
01:43:58,906 --> 01:43:59,806
>> Social security number.

2370
01:43:59,806 --> 01:44:01,156
>> Okay, so social
security number.

2371
01:44:01,156 --> 01:44:04,726
That is supposed to be a
unique number per person

2372
01:44:04,906 --> 01:44:07,066
and so you might want to
enforce that in the database

2373
01:44:07,176 --> 01:44:08,676
but frankly not everyone

2374
01:44:08,676 --> 01:44:10,706
in the world has a
social security number

2375
01:44:10,876 --> 01:44:12,616
so you might want
that field to be Null,

2376
01:44:12,716 --> 01:44:15,146
but when it's not Null
you want it to be unique

2377
01:44:15,286 --> 01:44:17,416
so you can have the database
at least enforce that.

2378
01:44:17,666 --> 01:44:18,636
What about email address?

2379
01:44:18,636 --> 01:44:21,706
Well, email address in theory
should be unique unless you're

2380
01:44:21,706 --> 01:44:22,966
sharing an account or something,

2381
01:44:23,236 --> 01:44:26,286
but it isn't necessarily what I
want to identify my users with,

2382
01:44:26,286 --> 01:44:28,766
if only because an email
address might be this long.

2383
01:44:28,766 --> 01:44:31,836
What would actually be a better
datatype to use in general

2384
01:44:31,836 --> 01:44:34,516
for uniquely identifying rows
and a table do you think?

2385
01:44:34,666 --> 01:44:35,536
What does Excel use?

2386
01:44:36,006 --> 01:44:36,196
>> Index.

2387
01:44:36,736 --> 01:44:38,076
>> What's an index though?

2388
01:44:38,436 --> 01:44:44,236
>> It's like an iterative
number.

2389
01:44:44,236 --> 01:44:44,606
>> Yeah, [inaudible] number.

2390
01:44:44,756 --> 01:44:45,776
So, just a number, right?

2391
01:44:45,866 --> 01:44:48,346
One, 2, 3, or maybe we
started with zero, 1, 2, 3.

2392
01:44:48,556 --> 01:44:49,516
So, just a number, why?

2393
01:44:49,516 --> 01:44:52,676
Cause then it's only 32 bits
or 64 bits, so it's only 4

2394
01:44:52,676 --> 01:44:55,296
or 8 bytes meanwhile my
email address, you know,

2395
01:44:55,296 --> 01:44:58,056
it could 10 characters, 20
characters even depending;

2396
01:44:58,286 --> 01:45:00,846
social security number is
similarly long with hyphens

2397
01:45:00,846 --> 01:45:03,316
and what not, so in
short, the best candidate

2398
01:45:03,316 --> 01:45:06,056
for a unique key is
probably not a string at all.

2399
01:45:06,266 --> 01:45:07,966
So, let's actually
alter this table.

2400
01:45:07,966 --> 01:45:11,866
I'm going to go ahead and
before I change the key notice

2401
01:45:11,866 --> 01:45:13,636
down here I can make
some alterations.

2402
01:45:13,636 --> 01:45:16,856
I'm going to say add a column
at the beginning of the table,

2403
01:45:17,216 --> 01:45:18,896
although where it goes
doesn't really matter,

2404
01:45:18,896 --> 01:45:22,276
but it's common convention
to put your primary key

2405
01:45:22,276 --> 01:45:26,106
at the top of, at the beginning
of your table, I'm click go,

2406
01:45:26,266 --> 01:45:28,766
this is going to allow me to
add a new field and I'm going

2407
01:45:28,766 --> 01:45:32,196
to call it aptly ID and I'm
going to leave this blank,

2408
01:45:32,196 --> 01:45:35,726
this blank, this blank, I'm
going to leave this as none,

2409
01:45:35,726 --> 01:45:37,796
I don't want a default value
and I'm going to go ahead

2410
01:45:37,796 --> 01:45:42,466
and click save and now notice
I have in int whose size

2411
01:45:42,466 --> 01:45:44,036
or type is apparently 11.

2412
01:45:44,366 --> 01:45:47,216
This is legacy thing, this does
not mean your integer can have

2413
01:45:47,216 --> 01:45:50,636
11 digits, it does not mean
your integer can have 11 bits,

2414
01:45:50,976 --> 01:45:54,606
it means when you are in
this black and white window

2415
01:45:54,606 --> 01:45:56,506
that we saw earlier
and you print

2416
01:45:56,506 --> 01:46:00,596
out these old school looking
columns like this thing here,

2417
01:46:00,986 --> 01:46:06,626
that means your integer will
use 11 characters in this black

2418
01:46:06,626 --> 01:46:08,626
and white interface,
so completely legacy,

2419
01:46:08,896 --> 01:46:12,396
so it's pretty much irrelevant;
11 was just the default now.

2420
01:46:12,846 --> 01:46:15,176
Alright, so now I
have some power.

2421
01:46:15,176 --> 01:46:17,426
I'm going to go ahead
and change this.

2422
01:46:17,426 --> 01:46:21,966
I'm going to go ahead, whoops,
let me go back to my structure.

2423
01:46:22,266 --> 01:46:29,006
I'm going to go to ID and let's
say what do I want to do here,

2424
01:46:29,056 --> 01:46:35,046
change, I'm going to go to why
is not letting me, there it is,

2425
01:46:35,526 --> 01:46:38,236
change, oh I didn't scroll over
far enough before this was here,

2426
01:46:38,706 --> 01:46:40,826
so AI, does anyone
recall what this is?

2427
01:46:40,826 --> 01:46:40,976
Yeah.

2428
01:46:41,986 --> 01:46:44,156
>> Part of auto-increment.

2429
01:46:44,536 --> 01:46:45,326
>> Auto-increment.

2430
01:46:45,326 --> 01:46:48,786
>> I think it will when you add
another user it's automatically

2431
01:46:48,786 --> 01:46:50,996
going to be two if
the first one is one.

2432
01:46:50,996 --> 01:46:51,486
>> Exactly.

2433
01:46:51,536 --> 01:46:51,926
>> Add [inaudible].

2434
01:46:52,686 --> 01:46:53,946
>> So, this is yet
another feature

2435
01:46:53,946 --> 01:46:55,786
of the database that's
not just nice to have,

2436
01:46:55,786 --> 01:46:57,286
it's really compelling.

2437
01:46:57,556 --> 01:46:59,726
So, if you want to
uniquely identify your users

2438
01:46:59,726 --> 01:47:02,556
as efficiently as possible much
like Facebook does with a number

2439
01:47:02,556 --> 01:47:05,326
in the URL unless you chose
a custom yeah username

2440
01:47:05,326 --> 01:47:08,726
for Facebook; a number is the
way to go; 32 bits, 64 bits,

2441
01:47:08,726 --> 01:47:10,986
definitely nice and
predictable instead

2442
01:47:10,986 --> 01:47:14,096
of a variable length string,
but you don't really want to be

2443
01:47:14,096 --> 01:47:15,256
in the business frankly
of figuring

2444
01:47:15,256 --> 01:47:18,076
out what user ID is available,
right, two would be kind

2445
01:47:18,076 --> 01:47:20,716
of annoying if anytime a
user registered you first had

2446
01:47:20,716 --> 01:47:23,716
to figure out the highest
number of the previous user's ID

2447
01:47:23,896 --> 01:47:26,666
and then choose the next number
by adding one, it's not hard,

2448
01:47:26,666 --> 01:47:28,706
I mean it's trivial
mentally to do that

2449
01:47:28,986 --> 01:47:31,936
but it's just extra work plus
you run into the situation

2450
01:47:31,936 --> 01:47:33,716
of what if two people
try to register ever

2451
01:47:33,716 --> 01:47:35,616
so slightly the same
time, you're going to get

2452
01:47:35,616 --> 01:47:37,136
into this so-called
race condition,

2453
01:47:37,136 --> 01:47:38,806
but again more on
that next time.

2454
01:47:38,956 --> 01:47:41,456
So, I'm going to choose
auto-increment here

2455
01:47:41,616 --> 01:47:44,206
because what this means
is when I do an insert

2456
01:47:44,206 --> 01:47:49,976
in the future I am going to
have the ID field automatically

2457
01:47:49,976 --> 01:47:50,966
assigned for me.

2458
01:47:51,266 --> 01:47:52,236
Now, I can't just do this

2459
01:47:52,236 --> 01:47:54,116
yet cause notice
incorrect table definition,

2460
01:47:54,116 --> 01:47:55,476
there can only be
one auto column

2461
01:47:55,476 --> 01:47:56,896
and it must be defined as a key.

2462
01:47:57,176 --> 01:48:00,306
We haven't finished that part of
the story of making this a key.

2463
01:48:00,606 --> 01:48:04,236
So, rather than use
username as my primary key

2464
01:48:04,346 --> 01:48:07,756
that uniquely identifies my
users, I'm instead going to say,

2465
01:48:07,756 --> 01:48:10,386
you're going what, ID
will be my primary key

2466
01:48:10,586 --> 01:48:13,196
and this is the SEQUEL query
that was just executed,

2467
01:48:13,196 --> 01:48:20,126
alter table users, add primary
key on ID and now I'm going

2468
01:48:20,126 --> 01:48:22,386
to go and change
this field and scroll

2469
01:48:22,386 --> 01:48:24,926
over to the side make it
auto-increment and save

2470
01:48:25,186 --> 01:48:28,356
and now notice under the extra
column, I'm just being reminded

2471
01:48:28,356 --> 01:48:30,596
that this has the
auto-increment flag on it

2472
01:48:30,596 --> 01:48:35,046
and here's how this worked;
alter table users, change ID,

2473
01:48:35,046 --> 01:48:39,846
ID to not Null, auto-increments.

2474
01:48:40,896 --> 01:48:44,506
Alright, so username that still
leaves the question of username.

2475
01:48:44,916 --> 01:48:49,386
Should username have any
kind of index or key?

2476
01:48:49,896 --> 01:48:51,766
Well, do you want
username to be unique?

2477
01:48:52,426 --> 01:48:54,716
So, we still do.

2478
01:48:55,026 --> 01:48:57,396
You can only have one
primary key though

2479
01:48:57,446 --> 01:48:59,706
and we'll see why next
time too why you want

2480
01:48:59,706 --> 01:49:02,396
to have a primary key as
opposed to just a unique index,

2481
01:49:02,506 --> 01:49:04,946
but I'm going to go over here
and say this should be unique,

2482
01:49:05,296 --> 01:49:07,696
so now what will happen, whoops,

2483
01:49:07,786 --> 01:49:09,656
my cursor is doing
strange things.

2484
01:49:09,806 --> 01:49:12,686
Here's browse, we only
have one user, so let's try

2485
01:49:12,686 --> 01:49:14,396
to insert someone
else into the database

2486
01:49:14,396 --> 01:49:16,596
and we'll do it the real
way with SEQUEL commands.

2487
01:49:17,006 --> 01:49:23,056
So, insert into users a
username and a password

2488
01:49:23,056 --> 01:49:26,076
and I'll put my quotes around
everything just good measure,

2489
01:49:26,076 --> 01:49:28,136
also because password
is a function recall.

2490
01:49:28,586 --> 01:49:34,536
The values of "malan"
"1, 2, 3, 4,

2491
01:49:35,216 --> 01:49:38,206
5" and semicolon is
not necessary here,

2492
01:49:38,206 --> 01:49:40,396
it's only necessary in
the command line clients.

2493
01:49:41,016 --> 01:49:44,086
So, let's do go, okay
so that was inserted.

2494
01:49:44,086 --> 01:49:46,666
If I go to browse now,
notice I have two rows,

2495
01:49:46,966 --> 01:49:48,156
malan and jharvard.

2496
01:49:48,426 --> 01:49:50,256
Now, let's try to
register malan again

2497
01:49:50,436 --> 01:49:53,556
but notice one thing first,
what was John Harvard's ID

2498
01:49:54,036 --> 01:49:55,806
and what was malan's ID?

2499
01:49:57,256 --> 01:49:58,306
One and two.

2500
01:49:58,466 --> 01:50:00,466
So, exactly what Axel
promised would happen,

2501
01:50:00,886 --> 01:50:03,206
the nice thing here is I didn't
have to think about that,

2502
01:50:03,206 --> 01:50:05,166
I didn't have to
insert two myself.

2503
01:50:05,166 --> 01:50:06,866
I won't have to insert
three myself,

2504
01:50:06,866 --> 01:50:09,916
if it's auto-increment auto is
literally the keyword here it

2505
01:50:09,916 --> 01:50:11,226
happens automatically for you

2506
01:50:11,486 --> 01:50:14,196
and you avoid this thing
called a race condition

2507
01:50:14,286 --> 01:50:16,236
where two people might be
registering at the same time

2508
01:50:16,476 --> 01:50:18,536
and might otherwise, if
I were writing the code,

2509
01:50:18,796 --> 01:50:20,666
give them the same ID
which would be bad.

2510
01:50:20,896 --> 01:50:23,036
So, now let's test our
uniqueness constraint.

2511
01:50:23,036 --> 01:50:26,156
Let's go into SEQUEL
again and let's do insert

2512
01:50:26,156 --> 01:50:32,856
into users a username
and a password of,

2513
01:50:32,856 --> 01:50:37,616
with these values malan
and a password of 5555,

2514
01:50:37,826 --> 01:50:39,996
so different password,
alright go;

2515
01:50:42,136 --> 01:50:45,566
duplicate entry malan
for key username.

2516
01:50:45,876 --> 01:50:47,086
So, what does this really mean?

2517
01:50:47,086 --> 01:50:48,746
So, this is an error
now that's happened

2518
01:50:48,746 --> 01:50:51,076
so if you called MySQL
query you would actually be

2519
01:50:51,076 --> 01:50:53,996
in form programmatically in
PHP an error has happened

2520
01:50:53,996 --> 01:50:57,396
which is you're way of inferring
I must have tried inserting the

2521
01:50:57,466 --> 01:51:01,596
same username twice, so we now
have this defense in place.

2522
01:51:02,496 --> 01:51:05,356
Alright, so this table is
coming along and it turns

2523
01:51:05,356 --> 01:51:07,376
out that primary keys are
going to have a relationship

2524
01:51:07,376 --> 01:51:08,916
with something called
foreign keys,

2525
01:51:09,566 --> 01:51:12,206
but more on those separately.

2526
01:51:13,236 --> 01:51:16,576
What else motivates this choice?

2527
01:51:17,206 --> 01:51:21,776
What might you gain by
telling the database in advance

2528
01:51:22,066 --> 01:51:25,426
that this field is special,
that this field is unique?

2529
01:51:26,406 --> 01:51:31,206
What might that lend itself
to that's not too big?

2530
01:51:31,706 --> 01:51:34,476
It's actually the answer
to; it's the same answer

2531
01:51:34,476 --> 01:51:35,566
to every question
thus far tonight.

2532
01:51:35,686 --> 01:51:35,753
Yeah.

2533
01:51:36,106 --> 01:51:39,926
>> Maybe it has to do with
something with performance.

2534
01:51:39,926 --> 01:51:40,896
>> Yeah, performance.

2535
01:51:41,166 --> 01:51:41,646
Good answer.

2536
01:51:41,906 --> 01:51:43,596
So, it does actually have
to do with performance.

2537
01:51:43,816 --> 01:51:48,226
When you create an index
as it's called or a key,

2538
01:51:48,226 --> 01:51:50,386
key index they're essentially
referring to the same thing.

2539
01:51:50,856 --> 01:51:52,776
When you create an
index on a field,

2540
01:51:52,776 --> 01:51:55,186
on a column in a database,
the database is going

2541
01:51:55,186 --> 01:51:58,606
to spend some effort, some time
and some disc space upfront

2542
01:51:59,136 --> 01:52:03,606
to optimize that column and by
that I mean it's going to build

2543
01:52:03,606 --> 01:52:04,976
up a secondary data structure

2544
01:52:04,976 --> 01:52:07,326
which is usually a data
structure of a tree,

2545
01:52:07,566 --> 01:52:10,156
a B-tree in fact if you've
take a data structure's class

2546
01:52:10,436 --> 01:52:12,216
and a B-tree is essentially
a very shallow tree

2547
01:52:12,216 --> 01:52:14,446
that lends itself to
searching large chunks of data

2548
01:52:14,446 --> 01:52:16,276
like you might have in
a big database table.

2549
01:52:16,516 --> 01:52:20,976
It creates this index, this
B-tree that makes it much easier

2550
01:52:20,976 --> 01:52:24,816
to answer questions of
the form who is ID two?

2551
01:52:25,306 --> 01:52:27,706
Or is jharvard in this table?

2552
01:52:28,026 --> 01:52:30,916
Any query you might want to
ask about a specific field

2553
01:52:30,986 --> 01:52:34,676
in this case ID or even the
username field will be faster.

2554
01:52:35,006 --> 01:52:38,236
Previously, when we did
the select star from users

2555
01:52:38,236 --> 01:52:39,766
where username equals jharvard,

2556
01:52:40,016 --> 01:52:42,096
that was a linear search
of the entire table.

2557
01:52:42,686 --> 01:52:44,936
Now, you were not unimpressed
because there was one person

2558
01:52:44,936 --> 01:52:46,736
in the table, so it's
obviously not that slow

2559
01:52:46,736 --> 01:52:49,876
to search the table, but if we
did have Facebook's 500 million

2560
01:52:49,876 --> 01:52:51,466
plus or 800 million plus users,

2561
01:52:51,766 --> 01:52:54,366
that would have had been
a linear search looking

2562
01:52:54,366 --> 01:52:57,026
for John Harvard, awful,
especially for large datasets.

2563
01:52:57,386 --> 01:52:59,766
Now, if you instead
click that button

2564
01:52:59,766 --> 01:53:01,666
and say make this
a unique index,

2565
01:53:01,666 --> 01:53:03,766
or make this a primary
key index,

2566
01:53:04,166 --> 01:53:06,676
then the database is going
to churn through that list

2567
01:53:06,676 --> 01:53:09,526
and will do linear search
maybe once, maybe twice,

2568
01:53:09,526 --> 01:53:12,776
maybe three times, but the
output of that process is going

2569
01:53:12,776 --> 01:53:14,536
to be some kind of tree
structure that's kept

2570
01:53:14,536 --> 01:53:17,556
around in RAM so that the next
time you ask me a question,

2571
01:53:17,706 --> 01:53:20,216
it's going to be much faster to
answer the query and it's going

2572
01:53:20,216 --> 01:53:21,946
to do something like
binary search

2573
01:53:21,946 --> 01:53:23,976
or even something
fancier than that.

2574
01:53:24,386 --> 01:53:25,986
So, indexes are huge

2575
01:53:25,986 --> 01:53:28,976
and if you've ever visited a
really bad website in terms

2576
01:53:28,976 --> 01:53:31,996
of performance, it's slow,
going from page to page,

2577
01:53:31,996 --> 01:53:33,936
has nothing to do with your
Internet connection cause you're

2578
01:53:33,936 --> 01:53:35,366
at home on Broadband
or what not,

2579
01:53:35,576 --> 01:53:38,856
just a really crappy website
odds are it has to do with one,

2580
01:53:38,856 --> 01:53:41,076
server could just be overloaded
and there are too popular

2581
01:53:41,076 --> 01:53:44,126
for their own good, or are very
likely they just didn't know

2582
01:53:44,126 --> 01:53:46,176
what they were doing when
creating their database tables

2583
01:53:46,176 --> 01:53:48,836
and they just created rows and
columns like you would in Excel;

2584
01:53:49,196 --> 01:53:52,976
they gave not thought to primary
keys, indexes, uniqueness,

2585
01:53:52,976 --> 01:53:53,986
or anything like that.

2586
01:53:54,566 --> 01:53:58,136
Full text is similar in spirit,
it can be used for text fields,

2587
01:53:58,406 --> 01:54:00,216
so here's one of these design
tradeoffs we didn't touch

2588
01:54:00,216 --> 01:54:03,216
on earlier, VARCHARs are great
for variable length strings.

2589
01:54:03,666 --> 01:54:05,966
Text fields are great for
variable length strings

2590
01:54:05,966 --> 01:54:08,826
that are even longer, but you do
pay a performance penalty cause

2591
01:54:08,826 --> 01:54:11,086
the text fields remember
end up elsewhere

2592
01:54:11,086 --> 01:54:13,056
which just means they're
not as local, caching issues

2593
01:54:13,056 --> 01:54:16,656
and so forth, but the
upside of using a text field

2594
01:54:16,656 --> 01:54:19,826
which is bigger is that you
can put a full text index on it

2595
01:54:20,146 --> 01:54:22,156
which means you can do
Google like query is

2596
01:54:22,156 --> 01:54:26,846
on the search saying return this
row if it has the keywords foo,

2597
01:54:26,846 --> 01:54:30,846
and bar, and not baz or things
like that and you don't have

2598
01:54:30,846 --> 01:54:33,566
to implement that yourself the
database can do it for you.

2599
01:54:33,856 --> 01:54:35,516
The price you pay
though is performance

2600
01:54:35,516 --> 01:54:37,556
for just selecting
the data potentially.

2601
01:54:37,556 --> 01:54:40,386
So, again, there's no perfect
solution here it really depends

2602
01:54:40,386 --> 01:54:44,416
on the use case you're
trying to solve.

2603
01:54:44,586 --> 01:54:46,516
So, let's consider a
problem that we're not going

2604
01:54:46,516 --> 01:54:49,326
to solve tonight, but
that does kind remain.

2605
01:54:49,526 --> 01:54:51,576
Suppose, we augment
our user's table

2606
01:54:51,816 --> 01:54:55,106
to include not just username and
password and not even just ID,

2607
01:54:55,106 --> 01:54:56,526
but again what are some
things we might want

2608
01:54:56,526 --> 01:54:57,606
to associate with a user?

2609
01:54:58,306 --> 01:54:58,586
Axel.

2610
01:54:58,856 --> 01:55:00,016
>> Their email address.

2611
01:55:00,156 --> 01:55:00,896
>> Email address, good.

2612
01:55:00,896 --> 01:55:01,456
Give me something else.

2613
01:55:02,096 --> 01:55:03,326
Yeah, Conner.

2614
01:55:03,326 --> 01:55:04,106
>> Phone number.

2615
01:55:04,246 --> 01:55:04,986
>> Phone number, good.

2616
01:55:04,986 --> 01:55:05,966
Something else.

2617
01:55:06,776 --> 01:55:06,936
>> Gender.

2618
01:55:07,086 --> 01:55:07,936
>> Gender, good.

2619
01:55:07,936 --> 01:55:08,906
Something else, we'll
keep doing this

2620
01:55:08,906 --> 01:55:12,626
until we get the answer
I need to tell the story.

2621
01:55:12,726 --> 01:55:12,793
Yeah.

2622
01:55:12,793 --> 01:55:14,736
>> Pictures of the user.

2623
01:55:14,736 --> 01:55:15,336
>> What's that?

2624
01:55:15,386 --> 01:55:15,606
>> Pictures.

2625
01:55:15,856 --> 01:55:17,306
>> Pictures of the user, good.

2626
01:55:17,656 --> 01:55:20,596
Oh, actually we can tell
a quick story about this.

2627
01:55:20,996 --> 01:55:22,316
How do you store binary data?

2628
01:55:22,606 --> 01:55:25,316
So, you actually can store it
in a database and in fact one

2629
01:55:25,316 --> 01:55:27,856
of the fields that we
didn't look at but was

2630
01:55:27,856 --> 01:55:31,766
on the screen there briefly
when you create a new field,

2631
01:55:31,766 --> 01:55:33,826
let me try to simulate it,
let me go to structure,

2632
01:55:34,216 --> 01:55:38,126
let me go ahead and add a
new field and show the types.

2633
01:55:38,126 --> 01:55:41,056
At the bottom here, there's
some fancier features

2634
01:55:41,056 --> 01:55:45,056
that we really haven't even
scratched the surface of BLOBs,

2635
01:55:45,416 --> 01:55:48,966
binary large objects and this
just refers to binary data,

2636
01:55:48,966 --> 01:55:51,246
so you could actually
store photographs of users

2637
01:55:51,306 --> 01:55:53,976
in the database or you
could store them on disc

2638
01:55:53,976 --> 01:55:54,786
and this is actually one

2639
01:55:54,786 --> 01:55:56,966
of these other nonobvious
design decisions, but this one

2640
01:55:56,966 --> 01:55:58,826
at least there's some
good rules of thumb.

2641
01:55:59,506 --> 01:56:04,046
Frankly, I am of the philosophy
that data belongs in a database

2642
01:56:04,046 --> 01:56:07,226
and files belong on a file
system and by this I mean

2643
01:56:07,226 --> 01:56:09,966
if you are having your users
upload photos or resumes

2644
01:56:09,966 --> 01:56:11,326
or what not, storing them

2645
01:56:11,326 --> 01:56:13,586
in the database is probably
the not the right place,

2646
01:56:13,586 --> 01:56:15,506
cause the database is going
to get bigger and bigger

2647
01:56:15,506 --> 01:56:17,926
and bigger, I mean files
are generally much bigger

2648
01:56:18,056 --> 01:56:20,906
than textual rows and so
you're going run eventually

2649
01:56:20,906 --> 01:56:22,856
into performance issues,
you're going to run potentially

2650
01:56:22,856 --> 01:56:24,326
into disc space issues,
you're going to run

2651
01:56:24,326 --> 01:56:26,946
into replication
issues whereby very much

2652
01:56:26,946 --> 01:56:28,446
in Vogue [assumed
spelling] these days or CDNs,

2653
01:56:28,446 --> 01:56:31,696
content delivery networks which
are just severs like Akamai

2654
01:56:31,876 --> 01:56:34,886
and Google and the like, and
Facebook has this too, hundreds

2655
01:56:34,886 --> 01:56:38,236
or thousands of servers whose
purpose in life is just to serve

2656
01:56:38,236 --> 01:56:41,876
up static content, JavaScript
files, jpegs, ping, movie files

2657
01:56:41,876 --> 01:56:43,306
and what not and
that's all they do.

2658
01:56:43,626 --> 01:56:46,316
If instead you're storing that
data in a database now you have

2659
01:56:46,316 --> 01:56:48,846
to replicate your database
around the entire world

2660
01:56:48,846 --> 01:56:51,246
and MySQL databases
here and here and here

2661
01:56:51,246 --> 01:56:54,206
which is totally possible,
but completely unnecessary

2662
01:56:54,346 --> 01:56:56,846
for scalability, so we'll talk
more about this at the end

2663
01:56:56,846 --> 01:57:00,356
of the semester but in
general, I think in like

2664
01:57:00,356 --> 01:57:03,586
that past 6-10 years I've
never stored binary data

2665
01:57:03,706 --> 01:57:06,636
in a database really,
at least not files.

2666
01:57:07,136 --> 01:57:10,746
Better to store the file on
disc in a folder for instance

2667
01:57:10,746 --> 01:57:13,276
of uploads that's owned
by you, [inaudible] suPHP

2668
01:57:13,276 --> 01:57:15,546
which we talked about
a couple times ago,

2669
01:57:15,986 --> 01:57:17,926
but what could you still
store in the database?

2670
01:57:18,466 --> 01:57:18,556
>> Path.

2671
01:57:19,066 --> 01:57:19,286
>> The path.

2672
01:57:19,846 --> 01:57:21,626
So you store the
path or the name

2673
01:57:21,626 --> 01:57:24,176
of the file that's uploaded, so
you still associate the thing

2674
01:57:24,176 --> 01:57:27,596
with the user but you store it
in a more natural environment

2675
01:57:27,666 --> 01:57:29,546
that lends itself to scalability

2676
01:57:29,546 --> 01:57:31,166
and access controls
and the like.

2677
01:57:31,496 --> 01:57:33,306
Now, there's some other
cool ones if you like things

2678
01:57:33,336 --> 01:57:38,996
like geography and anything
related to points and latitudes

2679
01:57:38,996 --> 01:57:41,346
and longitudes; there's nice
built in support in MySQL

2680
01:57:41,596 --> 01:57:44,566
for that kind of stuff too where
you can actually do queries

2681
01:57:44,566 --> 01:57:46,136
like is this a latitude,

2682
01:57:46,136 --> 01:57:49,916
longitudinal point nearby this
other one and similar queries

2683
01:57:49,966 --> 01:57:51,876
which you would have to
otherwise implement yourself

2684
01:57:52,256 --> 01:57:56,726
in CSV or XML or any
other domain like that.

2685
01:57:57,156 --> 01:58:00,936
Alright, so let me try to coax
this last story a little further

2686
01:58:00,936 --> 01:58:03,046
along, what else might you
want to associate with a user?

2687
01:58:03,596 --> 01:58:03,786
>> Address.

2688
01:58:05,536 --> 01:58:06,996
>> Address, there we go.

2689
01:58:06,996 --> 01:58:11,236
Okay, so address is interesting
because it may be the case

2690
01:58:11,426 --> 01:58:13,456
that you have unique
postal addresses

2691
01:58:13,456 --> 01:58:15,856
or home addresses unless you
have a family or a roommate,

2692
01:58:15,856 --> 01:58:17,336
so there's some corner
cases there

2693
01:58:17,696 --> 01:58:19,436
but let's start telling
this story.

2694
01:58:19,436 --> 01:58:21,446
So, now John Harvard
lives for instance

2695
01:58:21,446 --> 01:58:23,556
at 33 Oxford Street,
Cambridge, Mass.

2696
01:58:23,556 --> 01:58:26,076
02138. Suppose, David
Malan also lives

2697
01:58:26,076 --> 01:58:28,236
at 33 Oxford Street,
Cambridge, Mass.

2698
01:58:28,236 --> 01:58:32,106
02138 and so and
so on and so on,

2699
01:58:32,766 --> 01:58:35,216
what's the redundancy
there in particular?

2700
01:58:35,716 --> 01:58:35,816
Yeah.

2701
01:58:36,296 --> 01:58:40,676
>> You're storing
the same string

2702
01:58:40,676 --> 01:58:41,456
or whatever multiple times.

2703
01:58:41,456 --> 01:58:41,796
>> Exactly.

2704
01:58:41,796 --> 01:58:44,376
What string is really
redundant here or strings?

2705
01:58:44,376 --> 01:58:44,546
Jack.

2706
01:58:44,876 --> 01:58:45,296
>> Cambridge, Mass.

2707
01:58:45,546 --> 01:58:46,126
>> Cambridge, Mass.

2708
01:58:46,126 --> 01:58:48,076
Why the heck am I storing
Cambridge, Mass; Cambridge,

2709
01:58:48,076 --> 01:58:49,686
Mass; Cambridge,
Mass; Cambridge, Mass.

2710
01:58:49,686 --> 01:58:53,616
again and again when really I
could identify Cambridge, Mass.

2711
01:58:53,646 --> 01:58:54,166
by what?

2712
01:58:54,166 --> 01:58:54,476
>> Zip code.

2713
01:58:55,076 --> 01:58:55,756
>> By zip code.

2714
01:58:56,056 --> 01:58:59,116
Now as aside, the US is kind
of a mess with the zip codes

2715
01:58:59,116 --> 01:59:00,276
and over time I've learned

2716
01:59:00,276 --> 01:59:03,236
that zip codes actually don't
always follow town boundaries

2717
01:59:03,236 --> 01:59:05,416
and there can be weird
overlap and so forth

2718
01:59:05,616 --> 01:59:08,116
so this is a nicer story than
it is actually in practice.

2719
01:59:08,116 --> 01:59:11,256
Sometimes town share area,
zip codes and vice versa.

2720
01:59:11,696 --> 01:59:14,306
So, kind of a mess but that's
the human's fault not the

2721
01:59:14,306 --> 01:59:17,226
computer people's fault in this
case, so let's at least assume

2722
01:59:17,426 --> 01:59:20,226
that zip codes do uniquely
identify cities and states,

2723
01:59:20,536 --> 01:59:22,786
so this is great cause
I can store five digits

2724
01:59:22,916 --> 01:59:27,816
of CHAR field even, 02138
and then how do I remember

2725
01:59:27,816 --> 01:59:29,566
that 02138 maps to
Cambridge, Mass?

2726
01:59:29,676 --> 01:59:29,776
Jack.

2727
01:59:29,776 --> 01:59:32,276
>> Is there another database
somewhere that has all

2728
01:59:32,376 --> 01:59:33,176
of the areas that
have the existed

2729
01:59:33,176 --> 01:59:39,546
or have been put
up on your server?

2730
01:59:39,546 --> 01:59:40,236
>> Yeah, exactly.

2731
01:59:40,236 --> 01:59:42,316
So, you can buy off the
Internet a big database

2732
01:59:42,316 --> 01:59:44,486
of zip codes with city, states.

2733
01:59:44,646 --> 01:59:46,586
So, you could just buy
that or find it somewhere

2734
01:59:46,726 --> 01:59:48,856
and then you could store your
own local copy cause what you

2735
01:59:48,856 --> 01:59:50,626
can do with MySQL even
though we haven't done it

2736
01:59:50,626 --> 01:59:53,076
yet is you can have more
than one table, right?

2737
01:59:53,076 --> 01:59:54,856
Just like Excel can have
multiple worksheets.

2738
01:59:54,856 --> 01:59:56,446
MySQL can have multiple tables,

2739
01:59:56,446 --> 01:59:58,066
so I can have one
table called users

2740
01:59:58,356 --> 02:00:00,736
and another table called cities

2741
02:00:00,736 --> 02:00:03,666
or another table called zips
whatever I want to call it

2742
02:00:03,666 --> 02:00:07,116
and what would the primary
key be in the zips table?

2743
02:00:07,696 --> 02:00:10,896
And what would the columns be?

2744
02:00:10,896 --> 02:00:11,016
Yeah.

2745
02:00:11,306 --> 02:00:11,776
>> The zip code.

2746
02:00:11,976 --> 02:00:13,176
>> Yeah, probably the zip code.

2747
02:00:13,326 --> 02:00:15,426
Could be a unique
number like an integer

2748
02:00:15,426 --> 02:00:18,136
but if I already have a
number and it's a fixed length

2749
02:00:18,176 --> 02:00:19,626
like this is actually
a pretty good candidate

2750
02:00:19,626 --> 02:00:22,636
for a primary key, so what else
would be in the zips table?

2751
02:00:23,216 --> 02:00:24,836
>> The address.

2752
02:00:24,946 --> 02:00:27,586
>> Yeah, city and state and
I could push a little harder.

2753
02:00:27,586 --> 02:00:31,936
I could factor out anyone who
has 33 Oxford Street somehow,

2754
02:00:32,246 --> 02:00:35,156
but frankly storing the
street address redundantly is

2755
02:00:35,156 --> 02:00:36,196
probably okay.

2756
02:00:36,466 --> 02:00:40,116
Storing city, state again
and again doesn't feel nearly

2757
02:00:40,116 --> 02:00:43,206
as okay cause right, how
do I uniquely identify 33

2758
02:00:43,206 --> 02:00:43,976
Oxford Street?

2759
02:00:44,266 --> 02:00:46,016
I have to standardize
on a name for it

2760
02:00:46,016 --> 02:00:49,976
like 33 space Oxford space,
we'd have to make more

2761
02:00:49,976 --> 02:00:52,336
of a design decision there
whereas at least with city,

2762
02:00:52,336 --> 02:00:55,076
state and zip that in
theory should have a nicer

2763
02:00:55,076 --> 02:00:56,996
relationship, so I
can factor that out.

2764
02:00:57,236 --> 02:00:59,516
So, now the issue of primary
key should maybe make a little

2765
02:00:59,516 --> 02:01:00,026
more sense.

2766
02:01:00,026 --> 02:01:03,946
Now, we can have a zips table
whose primary key is zip code

2767
02:01:04,206 --> 02:01:06,886
and whose other columns are
city and state and so forth.

2768
02:01:07,616 --> 02:01:10,556
We can then have my user's
table and we add another column

2769
02:01:10,556 --> 02:01:13,366
to called zip and what do
I store in the zip field?

2770
02:01:13,366 --> 02:01:16,906
Well, something like 02138,
at that point in the story,

2771
02:01:16,906 --> 02:01:21,816
02138 or specifically
zip code is a primary key

2772
02:01:21,816 --> 02:01:26,566
in the zip's table and it's
what's called a foreign key

2773
02:01:26,716 --> 02:01:30,046
in my user's table, so this
too is why it's advantageous

2774
02:01:30,046 --> 02:01:31,566
to define these kinds of keys.

2775
02:01:31,846 --> 02:01:34,956
Primary key again means uniquely
identifies rows in this table.

2776
02:01:35,426 --> 02:01:39,496
Foreign key means, is a
primary key in another table

2777
02:01:39,746 --> 02:01:43,406
and what this will allow us to
do is take the user's table,

2778
02:01:43,756 --> 02:01:46,166
take the zip's table
and if I want to see all

2779
02:01:46,166 --> 02:01:47,436
of that data together, I want

2780
02:01:47,436 --> 02:01:50,216
to see jharvard crimson
02138 Cambridge, Mass.

2781
02:01:50,216 --> 02:01:52,586
I want to join all of
that information together

2782
02:01:52,776 --> 02:01:56,896
so I can get at it with
one associative array, one,

2783
02:01:56,896 --> 02:02:00,686
MySQL fetch assoc
call, what we can do is

2784
02:02:00,786 --> 02:02:05,256
if the right most column of this
user's table is zip and just

2785
02:02:05,256 --> 02:02:07,536
for the sake of pictures
the left most column of,

2786
02:02:07,816 --> 02:02:10,966
it's the opposite to you
guys, the left most column

2787
02:02:11,196 --> 02:02:14,436
of the zip's table is zip,
right so we have zips, zips,

2788
02:02:14,626 --> 02:02:17,976
we can effectively
overlap them and join them

2789
02:02:18,046 --> 02:02:20,276
so that now we have a
wider table that has all

2790
02:02:20,276 --> 02:02:21,896
of the data we care
about; redundantly

2791
02:02:21,896 --> 02:02:24,306
but at least now it's a
temporary table, so this refers

2792
02:02:24,306 --> 02:02:27,386
to generally the process of
normalization and factoring

2793
02:02:27,386 --> 02:02:30,176
out data that would otherwise
be redundant is not necessary

2794
02:02:30,176 --> 02:02:32,316
to keep and this is
the bread and butter

2795
02:02:32,316 --> 02:02:35,096
of relational databases; you
put as little information

2796
02:02:35,096 --> 02:02:37,866
as you need to solve the
problem in a given table,

2797
02:02:38,086 --> 02:02:41,026
you factor out as much as you
can and you leverage an feature

2798
02:02:41,026 --> 02:02:44,856
of SEQUEL known as joins to
actually rejoin the data later

2799
02:02:44,856 --> 02:02:47,416
which even though it costs
you a bit in CPU cycles,

2800
02:02:47,716 --> 02:02:50,526
saves you significantly
in space especially

2801
02:02:50,526 --> 02:02:53,356
when you have many,
many, many, users.

2802
02:02:53,866 --> 02:02:56,436
Alright, any questions?

2803
02:02:57,016 --> 02:03:02,146
Alright, why don't we go
ahead and call all on Louis.

2804
02:03:02,146 --> 02:03:02,356
>> Louis.

2805
02:03:02,776 --> 02:03:02,896
>> Louis.

2806
02:03:02,896 --> 02:03:03,746
>> When we're doing a project,

2807
02:03:03,746 --> 02:03:05,816
so PHPMyadmin that's
in the appliance?

2808
02:03:05,816 --> 02:03:06,076
>> Ah hum.

2809
02:03:09,816 --> 02:03:14,336
>> Is there any other, GUI that
we can use or do we have use

2810
02:03:14,336 --> 02:03:16,716
that because that's [inaudible]?

2811
02:03:16,716 --> 02:03:18,166
>> It's in the appliance
already,

2812
02:03:18,166 --> 02:03:23,456
if you're running Windows there
is a Windows when MySQL client

2813
02:03:23,456 --> 02:03:24,796
that you can download somewhere

2814
02:03:24,796 --> 02:03:28,886
on MySQL.com you would then
configure that Windows program

2815
02:03:28,936 --> 02:03:30,726
to talk to the IP
address of the appliance

2816
02:03:31,186 --> 02:03:33,626
with user name jhavard password
crimson and you could use

2817
02:03:33,626 --> 02:03:36,196
that as well, to
be honest, I would,

2818
02:03:36,196 --> 02:03:38,176
even though I have my
qualms with its UI,

2819
02:03:38,936 --> 02:03:41,766
it's actually a wonderful useful
tool that just gets the job done

2820
02:03:41,766 --> 02:03:43,766
and it's not again a
key part of the project,

2821
02:03:43,986 --> 02:03:46,556
it's just a user friendly
way of getting at the data

2822
02:03:46,766 --> 02:03:48,016
and creating tables and such.

2823
02:03:48,016 --> 02:03:50,966
>> We can use any methods
to create a database?

2824
02:03:50,966 --> 02:03:51,606
>> Yep, absolutely.

2825
02:03:51,606 --> 02:03:53,506
>> PHPMyadmin and stuff?

2826
02:03:53,686 --> 02:03:54,366
>> So, exactly.

2827
02:03:54,366 --> 02:03:55,586
So, at the end well and actually

2828
02:03:55,636 --> 02:03:57,676
for Project Zero you
don't need MySQL at all,

2829
02:03:57,676 --> 02:03:59,006
you don't need PHPMydmin at all,

2830
02:03:59,006 --> 02:04:01,196
so this will only be relevant
next week for Project One

2831
02:04:01,196 --> 02:04:04,226
which will actually use MySQL,
but at that point you're welcome

2832
02:04:04,226 --> 02:04:05,836
to use any development
environment you want,

2833
02:04:05,836 --> 02:04:07,886
any tools that you
want, so long as,

2834
02:04:07,886 --> 02:04:10,696
as we say in the spec
your code works properly

2835
02:04:10,696 --> 02:04:13,176
when you install it in the
appliance so that when we

2836
02:04:13,176 --> 02:04:14,966
and the teaching staff
install it in our appliance,

2837
02:04:15,286 --> 02:04:17,526
it's guaranteed to
work and it's not tied

2838
02:04:17,526 --> 02:04:20,776
to your random PC configuration
or Mac configuration,

2839
02:04:21,086 --> 02:04:23,306
there's at least a
standard installation setup.

2840
02:04:23,906 --> 02:04:27,746
Let's adjourn there.

2841
02:04:27,746 --> 02:04:29,066
I'll stick around for
one on one questions,

2842
02:04:29,066 --> 02:04:31,436
otherwise Alon will start with
section in just a little bit.

2843
02:04:31,436 --> 02:04:31,916
See you on Monday.

