1
00:00:00,506 --> 00:00:09,056
[ Silence ]

2
00:00:09,556 --> 00:00:10,156
>> All right.

3
00:00:10,626 --> 00:00:11,536
Apologies for the delay.

4
00:00:12,416 --> 00:00:15,546
Welcome to Computer Science
S-75, Building Dynamic Websites.

5
00:00:15,636 --> 00:00:16,496
My name is David.

6
00:00:16,496 --> 00:00:17,946
I'll be your instructor
this summer.

7
00:00:18,316 --> 00:00:20,156
And it's a pretty brief summer.

8
00:00:20,156 --> 00:00:22,646
So we're going to dive right
in tonight to some material,

9
00:00:22,646 --> 00:00:26,826
then we'll take a breath, look
at the structure of the course,

10
00:00:26,826 --> 00:00:28,616
expectations thereof,
and then conclude

11
00:00:28,616 --> 00:00:29,906
with some additional material.

12
00:00:30,136 --> 00:00:31,446
And along the way,
please interject

13
00:00:31,446 --> 00:00:32,796
with any questions
that you might have.

14
00:00:32,846 --> 00:00:33,896
But first some questions
from me.

15
00:00:34,246 --> 00:00:38,306
So, you go ahead on the internet
on your laptop or desktop.

16
00:00:38,306 --> 00:00:39,656
You pull up your
favorite browser,

17
00:00:39,656 --> 00:00:44,206
you type in www.google.com
and hit enter, what happens?

18
00:00:44,626 --> 00:00:47,656
Let's tell the story and we can
be as high level or low level

19
00:00:47,656 --> 00:00:49,666
as we want and I'll steer
us in both directions.

20
00:00:49,666 --> 00:00:51,386
So you've hit enter,
what happens?

21
00:00:51,636 --> 00:00:53,456
Give me anything you got.

22
00:00:53,546 --> 00:00:53,676
Yeah?

23
00:00:53,936 --> 00:00:55,246
>> Well, first the request
is sent through your modem

24
00:00:55,246 --> 00:00:58,526
to your internet
service provider

25
00:00:58,756 --> 00:01:00,916
to wherever the Google
website is stored

26
00:01:00,916 --> 00:01:02,446
and then the information
is sent back.

27
00:01:03,036 --> 00:01:03,746
>> Oh good, OK.

28
00:01:03,746 --> 00:01:04,516
So that's the whole story.

29
00:01:04,576 --> 00:01:05,236
So that's very good.

30
00:01:05,576 --> 00:01:07,046
Let's tease it apart
a little bit now.

31
00:01:07,046 --> 00:01:08,606
And I'll repeat some of
the answers sometimes

32
00:01:08,666 --> 00:01:11,566
into the microphone so that our
folks who are taking the course

33
00:01:11,566 --> 00:01:13,776
from afar can hear everything.

34
00:01:13,776 --> 00:01:16,976
So your computer makes a request
to your-- through your modem,

35
00:01:16,976 --> 00:01:19,606
goes to your ISP,
reaches google.com servers

36
00:01:19,606 --> 00:01:21,296
and they reply with a response.

37
00:01:21,296 --> 00:01:24,166
So good, now let's dive in
deeper there, and let's focus

38
00:01:24,376 --> 00:01:28,166
on the act of hitting enter. Does
someone want to propose what--

39
00:01:28,166 --> 00:01:30,526
give me just one step, in
more technical detail what

40
00:01:30,526 --> 00:01:31,296
happens next.

41
00:01:31,526 --> 00:01:33,576
And then we'll get to that
same end point eventually.

42
00:01:34,516 --> 00:01:37,856
[ Inaudible Remark ]

43
00:01:38,356 --> 00:01:41,016
Perfect. So we first need to
translate the name of the site,

44
00:01:41,016 --> 00:01:44,216
in this case www.google.com,
into an IP address.

45
00:01:44,346 --> 00:01:47,706
And someone else,
what is an IP address?

46
00:01:47,886 --> 00:01:47,976
Yeah.

47
00:01:48,356 --> 00:01:52,556
>> Well, it identifies
like a server or?

48
00:01:52,556 --> 00:01:53,736
>> OK, good.

49
00:01:53,736 --> 00:01:56,506
So an IP address, it
identifies a server

50
00:01:56,506 --> 00:01:57,706
or computer on the internet.

51
00:01:57,706 --> 00:02:01,416
And an IP address is simply
a number of this form.

52
00:02:01,416 --> 00:02:02,246
Let me go ahead and pull

53
00:02:02,246 --> 00:02:04,776
up a little scratch
pad for notes here.

54
00:02:05,286 --> 00:02:10,236
So, an IP address, as you've
probably seen, is something

55
00:02:10,236 --> 00:02:14,526
of the form w.x.y.z. And
little internet trivia,

56
00:02:14,926 --> 00:02:18,196
each of these placeholders can
be a digit from what to what?

57
00:02:18,916 --> 00:02:19,946
Or number from what to what?

58
00:02:21,186 --> 00:02:21,366
Sure.

59
00:02:21,366 --> 00:02:22,236
>> Zero to 255.

60
00:02:23,006 --> 00:02:23,366
>> Perfect.

61
00:02:23,366 --> 00:02:24,446
Zero to 255.

62
00:02:24,446 --> 00:02:26,616
And there's some restrictions
on what numbers can be where

63
00:02:26,766 --> 00:02:29,026
but essentially you have number
dot number dot number dot

64
00:02:29,026 --> 00:02:31,156
number, and each of those
numbers can be again zero

65
00:02:31,156 --> 00:02:32,376
through 255.

66
00:02:32,586 --> 00:02:35,956
And if we really want to start
pressing deeper here how many

67
00:02:35,956 --> 00:02:38,676
bits are used to represent
an entire IP address

68
00:02:38,916 --> 00:02:39,646
under this scheme?

69
00:02:40,236 --> 00:02:44,956
For those familiar
with bits, 32.

70
00:02:44,956 --> 00:02:45,836
So why is that?

71
00:02:45,836 --> 00:02:48,736
Well for those less
familiar, unfamiliar,

72
00:02:48,736 --> 00:02:51,746
if you want to represent
the numbers zero through 255,

73
00:02:51,796 --> 00:02:54,936
which is a total of 256
numbers, you need 8 bits,

74
00:02:55,266 --> 00:02:57,216
because 2 to the 8th is 256.

75
00:02:57,216 --> 00:02:59,896
But we won't go into too much
detail along those lines.

76
00:03:00,156 --> 00:03:03,046
But if you've seen that IP
addresses are just 32 bits,

77
00:03:03,596 --> 00:03:06,806
it is because each of these
numbers is 8 bits itself.
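To make the arithmetic above concrete, here is a minimal Python sketch that packs the four numbers of a w.x.y.z address into one 32-bit integer. The function name `dotted_quad_to_int` and the sample address are illustrative assumptions, not anything from the lecture.

```python
def dotted_quad_to_int(address: str) -> int:
    """Pack a w.x.y.z address into a single 32-bit integer."""
    octets = [int(part) for part in address.split(".")]
    if len(octets) != 4 or any(not 0 <= octet <= 255 for octet in octets):
        raise ValueError(f"not a valid IPv4 address: {address!r}")
    value = 0
    for octet in octets:
        # Make room for 8 more bits, then drop the next number into them.
        value = (value << 8) | octet
    return value


# 1.2.3.4 is an arbitrary example address.
print(dotted_quad_to_int("1.2.3.4"))          # 16909060
print(dotted_quad_to_int("255.255.255.255"))  # 4294967295, i.e. 2**32 - 1
```

Four numbers of 8 bits each fill exactly 32 bits, which is why the largest address maps to 2**32 - 1.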

78
00:03:07,076 --> 00:03:09,606
So actually, let's go here,
there won't be much math

79
00:03:09,606 --> 00:03:11,466
in this course after the
following sentence really.

80
00:03:11,856 --> 00:03:13,826
But if you have 32 bits,

81
00:03:14,106 --> 00:03:16,286
how many possible IP
addresses are there

82
00:03:16,316 --> 00:03:17,636
for the world's computers?

83
00:03:21,236 --> 00:03:21,346
Yeah.

84
00:03:22,016 --> 00:03:23,196
[ Inaudible Remark ]

85
00:03:23,196 --> 00:03:25,346
So it's 2 to the 32nd
which is roughly?

86
00:03:25,346 --> 00:03:26,776
Those who are good with
math in their heads?

87
00:03:28,236 --> 00:03:29,786
So it's roughly 4 billion.

88
00:03:29,876 --> 00:03:33,666
So that's a lot but these
days most of you have laptops,

89
00:03:33,666 --> 00:03:38,556
most of you have desktops,
most of you have telephones

90
00:03:38,556 --> 00:03:40,766
in your pockets or
iPads or the like.

91
00:03:40,766 --> 00:03:42,326
So there's more and
more devices these days

92
00:03:42,326 --> 00:03:43,646
that are consuming IP addresses.

93
00:03:43,906 --> 00:03:46,996
So if you follow the popular
media of late you'll find

94
00:03:46,996 --> 00:03:48,686
that people have been
freaking out that we're

95
00:03:48,686 --> 00:03:50,226
about to run out
of IP addresses.

96
00:03:50,226 --> 00:03:51,956
But that's because we've
been using version 4

97
00:03:51,956 --> 00:03:52,726
for far too long.

98
00:03:52,976 --> 00:03:55,376
Thankfully, version 6 has
begun to get rolled out

99
00:03:55,706 --> 00:04:00,286
and version 6 will have 128 bit
IP addresses, which is great

100
00:04:00,286 --> 00:04:02,776
because that's 2 to
the 128, which is huge,

101
00:04:02,846 --> 00:04:03,976
barely pronounceable,

102
00:04:04,266 --> 00:04:06,756
but it will also become
a little more complex

103
00:04:06,756 --> 00:04:08,056
to write these things down.
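For a sense of scale, the short Python sketch below compares the two address spaces using the standard `ipaddress` module. The sample address `2001:db8::1` is an assumption chosen from the range reserved for documentation; it does not come from the lecture.

```python
import ipaddress

ipv4_total = 2 ** 32
ipv6_total = 2 ** 128

print(f"{ipv4_total:,}")  # 4,294,967,296
print(f"{ipv6_total:,}")

# The widest possible network of each version contains every address.
assert ipaddress.ip_network("0.0.0.0/0").num_addresses == ipv4_total
assert ipaddress.ip_network("::/0").num_addresses == ipv6_total

# Version 6 addresses are written as eight groups of hexadecimal digits,
# which is why they are more awkward to write down than w.x.y.z.
example = ipaddress.ip_address("2001:db8::1")
print(example.exploded)  # 2001:0db8:0000:0000:0000:0000:0000:0001
```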

104
00:04:08,056 --> 00:04:10,456
So we can squeeze a few
more years of discussion

105
00:04:10,456 --> 00:04:11,306
out of these addresses

106
00:04:11,306 --> 00:04:13,366
but realize the world
is transitioning.

107
00:04:13,546 --> 00:04:15,746
Now just for the sake of the
experience for those at home.

108
00:04:15,746 --> 00:04:18,316
Let me actually pause
here, just so we can plug

109
00:04:18,316 --> 00:04:20,866
in this recording device so we
can capture to another format.

110
00:04:20,956 --> 00:04:23,336
So let's leave that as a
cliffhanger for just a minute

111
00:04:23,606 --> 00:04:25,926
or two and I'll be right back.

112
00:04:26,186 --> 00:04:27,976
So, where did we leave off?

113
00:04:27,976 --> 00:04:29,906
You'd just hit enter,
we had proposed

114
00:04:29,906 --> 00:04:32,266
that your computer had
translated or needed

115
00:04:32,266 --> 00:04:36,156
to translate the host
name www.google.com

116
00:04:36,186 --> 00:04:37,726
into an IP address and
then we talked for a moment

117
00:04:37,726 --> 00:04:39,996
about various forms
of IP addresses.

118
00:04:40,186 --> 00:04:41,616
So let's now push
a little harder

119
00:04:41,616 --> 00:04:43,456
on how this translation happens.

120
00:04:43,656 --> 00:04:46,046
So Google has a numeric
address of this form.

121
00:04:46,156 --> 00:04:48,936
And as an aside Google actually
has probably a whole bunch

122
00:04:48,986 --> 00:04:50,136
of IP addresses of that form.

123
00:04:50,136 --> 00:04:52,306
All of which lead to
the same experience

124
00:04:52,496 --> 00:04:53,986
but perhaps different servers.

125
00:04:54,316 --> 00:04:56,716
So how does your
little Mac or PC

126
00:04:56,866 --> 00:04:59,556
or Linux computer know
what the IP address

127
00:04:59,556 --> 00:05:02,146
of www.google.com actually is?

128
00:05:02,626 --> 00:05:06,696
>> It has to do with a domain
name lookup, the DNS servers.

129
00:05:06,696 --> 00:05:07,186
>> OK. Good.

130
00:05:07,186 --> 00:05:09,876
So it has to do with domain
name lookup using a DNS server.

131
00:05:09,876 --> 00:05:12,766
So for those unfamiliar
DNS is Domain Name System.

132
00:05:12,986 --> 00:05:14,606
And this is an infrastructure
on the internet

133
00:05:14,606 --> 00:05:16,196
that pretty much
does exactly that.

134
00:05:16,196 --> 00:05:18,526
It converts domain
names and host names

135
00:05:18,836 --> 00:05:21,186
to IP addresses and vice versa.
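As a rough sketch of the forward half of that conversion, Python's standard `socket` module can ask whatever resolver the operating system is configured to use. The helper name `resolve` is an assumption, and the addresses you get back depend on your network.

```python
import socket


def resolve(hostname: str) -> list[str]:
    """Return the IPv4 addresses the system resolver reports for hostname."""
    results = socket.getaddrinfo(
        hostname, None, family=socket.AF_INET, type=socket.SOCK_STREAM
    )
    # Each result is (family, type, proto, canonname, (address, port)).
    return sorted({info[4][0] for info in results})


# Example call (needs an internet connection; output varies by network):
# resolve("www.google.com")
```

A popular site may well return several addresses, which matches the aside above that Google probably has a whole bunch of them.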

136
00:05:21,186 --> 00:05:23,846
And we'll see tonight that it
does a few other things in terms

137
00:05:23,976 --> 00:05:27,376
of helping with rerouting
of email, with validation

138
00:05:27,376 --> 00:05:29,506
of ownership of domains
and the like.

139
00:05:29,666 --> 00:05:31,026
So there are these
servers out there.

140
00:05:31,026 --> 00:05:33,716
Now your computer, your home
probably doesn't have its own

141
00:05:33,716 --> 00:05:36,996
DNS server but probably
Harvard does if you're on campus

142
00:05:36,996 --> 00:05:40,166
or Comcast does or Verizon
or your company does.

143
00:05:40,426 --> 00:05:42,936
Now if you're at a small
college, for instance,

144
00:05:42,936 --> 00:05:44,426
and you're not visiting
google.com

145
00:05:44,586 --> 00:05:47,056
but you're visiting
somerandomwebsite.com,

146
00:05:47,056 --> 00:05:49,466
it's very possible that you are
the first person on that campus

147
00:05:49,466 --> 00:05:52,546
to visit that website ever
or at least in a long time.

148
00:05:52,906 --> 00:05:57,616
So what if your small little
campus's DNS server has no idea

149
00:05:57,616 --> 00:05:59,006
what this IP address is?

150
00:05:59,496 --> 00:06:02,066
Are you sort of out of luck
because you went to that school

151
00:06:02,066 --> 00:06:04,456
and not one where there's more
people using that website?

152
00:06:05,286 --> 00:06:07,086
Or equivalently it's kind of
a chicken and the egg problem.

153
00:06:07,086 --> 00:06:09,576
If you're the first person to
ever need to visit that website

154
00:06:09,576 --> 00:06:12,056
and therefore your campus's
DNS server has no idea what

155
00:06:12,056 --> 00:06:16,556
that mapping is, how do
you solve this problem?

156
00:06:16,656 --> 00:06:16,746
Yeah.

157
00:06:17,516 --> 00:06:21,836
[ Inaudible Remark ]

158
00:06:22,336 --> 00:06:24,866
Exactly, so there's a
hierarchy, thankfully,

159
00:06:24,866 --> 00:06:28,166
to the DNS system whereby even
though you might have your own

160
00:06:28,166 --> 00:06:30,096
DNS server on campus or company

161
00:06:30,386 --> 00:06:33,196
but that doesn't necessarily
store all possible domain names

162
00:06:33,196 --> 00:06:34,536
and IP addresses in the world.

163
00:06:34,536 --> 00:06:37,086
In fact, that would be quite
a large database otherwise,

164
00:06:37,086 --> 00:06:38,696
and it's just not
efficient to keep all of them

165
00:06:38,696 --> 00:06:39,996
around if they're
not being accessed

166
00:06:39,996 --> 00:06:41,756
at all or very frequently.

167
00:06:42,016 --> 00:06:45,206
But your ISP knows some
bigger fish and maybe

168
00:06:45,206 --> 00:06:47,146
that bigger fish knows
an even bigger fish

169
00:06:47,146 --> 00:06:49,626
that has its own DNS
servers that might know.

170
00:06:49,626 --> 00:06:52,856
But in the worst, if no one
along this hierarchy knows,

171
00:06:52,856 --> 00:06:55,216
there also exists in the world
what are called root servers

172
00:06:55,486 --> 00:06:57,296
which are spread
out geographically

173
00:06:57,296 --> 00:07:00,076
across the several continents,
and it's those root servers

174
00:07:00,076 --> 00:07:03,936
that essentially know who does
know what the IP address is

175
00:07:03,936 --> 00:07:05,986
of somerandomwebsite.com.

176
00:07:06,186 --> 00:07:08,596
In other words, those
root servers know

177
00:07:08,826 --> 00:07:11,506
who the authority is for
instance for all of the dot coms

178
00:07:11,506 --> 00:07:14,386
in the world or all of the
dot nets or the like so

179
00:07:14,386 --> 00:07:16,516
that you can have
this initial request

180
00:07:16,516 --> 00:07:18,706
from your little old
computer bubble

181
00:07:18,706 --> 00:07:22,106
up to these very high level
servers and then bubble back

182
00:07:22,106 --> 00:07:25,126
down to some authority
who does actually know.
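The bubbling up and back down can be sketched as a toy, in-memory model. Everything here is an assumption made for illustration: the server labels, the dictionaries standing in for servers, and the address 192.0.2.10 (taken from a range reserved for documentation). Real DNS servers exchange network messages and expire cached answers, which this sketch ignores.

```python
# The root knows who the authority is for each top-level domain.
ROOT = {"com": "tld-com"}

# Each TLD authority knows which DNS server was registered for a domain.
TLD_SERVERS = {"tld-com": {"somerandomwebsite.com": "ns-example"}}

# That domain's own DNS server knows the actual address.
AUTHORITATIVE = {"ns-example": {"www.somerandomwebsite.com": "192.0.2.10"}}

# What the small campus DNS server has already seen.
local_cache: dict[str, str] = {}


def lookup(hostname: str) -> str:
    """Answer from the local cache if possible, else walk the hierarchy."""
    if hostname in local_cache:
        return local_cache[hostname]
    labels = hostname.split(".")
    tld = labels[-1]
    domain = ".".join(labels[-2:])
    tld_server = ROOT[tld]                        # bubble up to the root
    name_server = TLD_SERVERS[tld_server][domain]
    address = AUTHORITATIVE[name_server][hostname]  # back down to the authority
    local_cache[hostname] = address               # the next visitor gets a fast answer
    return address


print(lookup("www.somerandomwebsite.com"))  # 192.0.2.10
```

The first person on campus pays for the full walk; everyone after that is served from the cache.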

183
00:07:25,356 --> 00:07:26,946
And the reason for that--

184
00:07:27,216 --> 00:07:29,416
that that works is
because when you go

185
00:07:29,416 --> 00:07:32,706
and buy your own domain name,
which is a process we'll discuss

186
00:07:32,706 --> 00:07:36,906
in just a bit, you have to tell
the world what the IP address is

187
00:07:36,906 --> 00:07:38,786
of your DNS server.

188
00:07:38,786 --> 00:07:41,916
So someone has to be informed
proactively once really

189
00:07:42,106 --> 00:07:43,556
and only once when
you buy the domain.

190
00:07:43,786 --> 00:07:46,206
So for now, let's come back
to our story, we've hit enter,

191
00:07:46,446 --> 00:07:48,426
google.com was in
my browser's window,

192
00:07:48,426 --> 00:07:52,406
my computer has somehow
figured out that it is 1.2.3.4

193
00:07:52,406 --> 00:07:53,296
or something like that.

194
00:07:53,486 --> 00:07:56,366
So now my computer puts
together a message to send it

195
00:07:56,366 --> 00:07:58,526
across the internet
to google.com.

196
00:07:58,526 --> 00:07:59,686
What does that message
look like?

197
00:07:59,946 --> 00:08:01,816
Well, in its simplest
form it's a message

198
00:08:01,966 --> 00:08:03,316
that pretty much
looks like this.

199
00:08:03,706 --> 00:08:06,606
It is literally the word
GET in all caps, a space,

200
00:08:06,606 --> 00:08:09,156
a forward slash, if you're
just requesting the root

201
00:08:09,316 --> 00:08:11,486
of the web server demarked
typically with slash.

202
00:08:12,056 --> 00:08:14,596
And then HTTP slash
version number.

203
00:08:14,766 --> 00:08:17,846
Now in reality there's a few
more headers, so to speak.

204
00:08:17,846 --> 00:08:20,856
HTTP headers that get
sent from browser to server

205
00:08:20,856 --> 00:08:22,666
and we'll see those in
action in just a bit.

206
00:08:22,986 --> 00:08:26,796
But this message captures
really the most important aspect

207
00:08:26,796 --> 00:08:27,416
of the request.
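Written out as text, that message might look like the output of the sketch below. The version number 1.1, the `Host` and `Connection` headers, and the helper name `build_request` are assumptions added for illustration; the lecture only spells out the GET line.

```python
def build_request(path: str = "/", host: str = "www.google.com") -> bytes:
    """Assemble a minimal HTTP GET request as raw bytes."""
    lines = [
        f"GET {path} HTTP/1.1",   # method, space, path, space, HTTP/version
        f"Host: {host}",          # one of the extra headers browsers send
        "Connection: close",
        "",                       # a blank line ends the headers
        "",
    ]
    return "\r\n".join(lines).encode("ascii")


print(build_request().decode("ascii"))
```

Each line ends with a carriage return and line feed, and the empty line tells the server that the headers are finished.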

208
00:08:27,876 --> 00:08:30,836
So your little computer
creates a virtual envelope,

209
00:08:30,836 --> 00:08:32,706
more technically called
a packet of some sort.

210
00:08:32,706 --> 00:08:34,946
Inside of that packet
is a message like this.

211
00:08:35,276 --> 00:08:37,756
On the front of
that virtual envelope is a

212
00:08:37,756 --> 00:08:41,886
"to" address, namely 1.2.3.4 or
whatever Google's IP address is.

213
00:08:42,176 --> 00:08:44,546
In the return field of this
virtual envelope, you know,

214
00:08:44,546 --> 00:08:46,216
just like you were mailing
something to a human,

215
00:08:46,526 --> 00:08:48,186
there is the return
address who--

216
00:08:48,186 --> 00:08:50,186
which should be whose
IP address, probably?

217
00:08:51,506 --> 00:08:54,406
So your own IP address and
your computer does know

218
00:08:54,406 --> 00:08:55,936
that if you have an
internet connection.

219
00:08:56,236 --> 00:08:58,736
And then your computer sends
it out on the internet.

220
00:08:58,866 --> 00:09:00,936
Now we can dive deeper
and deeper and deeper

221
00:09:00,936 --> 00:09:05,086
but for now assume that your
ISP has what's called a default

222
00:09:05,086 --> 00:09:07,006
gateway, a.k.a. a router.

223
00:09:07,226 --> 00:09:09,496
And routers are the computers
on the internet that know how

224
00:09:09,496 --> 00:09:11,376
to get data from
point A to point B

225
00:09:11,376 --> 00:09:14,326
or if they don't know
precisely how to go from A to B,

226
00:09:14,546 --> 00:09:16,976
they know whom to
pass it off to,

227
00:09:17,176 --> 00:09:19,806
who can then get it one
step closer to point B.

228
00:09:20,066 --> 00:09:23,596
So in reality a packet, this
virtual envelope might go

229
00:09:23,596 --> 00:09:25,886
from router to router to router
to router, sometimes as many

230
00:09:25,886 --> 00:09:28,666
as 30 different
routers across the globe

231
00:09:28,896 --> 00:09:32,196
until finally it gets to its
actual destination, google.com.
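The hand-off from router to router can be sketched with a toy table per router. All of the router names and table entries below are made up for illustration, and 1.2.3.4 is the lecture's placeholder address; real routers match on address prefixes rather than exact strings.

```python
# Each router either knows the next step for a destination or falls back
# to a default neighbor that is presumed to know more.
ROUTING_TABLES = {
    "laptop": {"default": "home-router"},
    "home-router": {"default": "isp-router"},
    "isp-router": {"default": "backbone-router"},
    "backbone-router": {"1.2.3.4": "google-router", "default": "isp-router"},
    "google-router": {"1.2.3.4": "1.2.3.4", "default": "backbone-router"},
}


def trace(source: str, destination: str, max_hops: int = 30) -> list[str]:
    """Follow next-hop decisions until the destination is reached."""
    hops = [source]
    current = source
    while current != destination:
        if len(hops) > max_hops:
            raise RuntimeError("too many hops; giving up")
        table = ROUTING_TABLES[current]
        current = table.get(destination, table["default"])
        hops.append(current)
    return hops


print(" -> ".join(trace("laptop", "1.2.3.4")))
```

No single router needs the whole map; each one only needs to know whom to pass the envelope to next.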

232
00:09:32,196 --> 00:09:34,756
Google receives this virtual
envelope, sees that it's

233
00:09:34,756 --> 00:09:38,406
for its IP address, opens the
envelope up, sees this message.

234
00:09:38,616 --> 00:09:41,736
google.com server happens to
be running a web server and so

235
00:09:41,736 --> 00:09:44,136
that web server looks for
the file called slash.

236
00:09:44,686 --> 00:09:47,596
Now slash is typically a
synonym for an actual filename

237
00:09:47,596 --> 00:09:51,106
like index.html or
index.php or any number

238
00:09:51,106 --> 00:09:53,506
of other default
standard filenames.
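A hedged sketch of that synonym idea: a web server that sees a path ending in a slash tries a list of default filenames. The list `DEFAULT_FILES` and the helper `candidate_files` are assumptions for illustration; on a real server the list is a configuration setting.

```python
DEFAULT_FILES = ("index.html", "index.php")  # typical defaults, not a fixed rule


def candidate_files(request_path: str) -> list[str]:
    """List the files a server might look for, in order, for a request path."""
    if request_path.endswith("/"):
        # A folder was requested, so try each default filename inside it.
        return [request_path + name for name in DEFAULT_FILES]
    return [request_path]


print(candidate_files("/"))            # ['/index.html', '/index.php']
print(candidate_files("/about.html"))  # ['/about.html']
```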

239
00:09:53,706 --> 00:09:56,936
So Google grabs that file from
its hard drive and then puts it

240
00:09:56,936 --> 00:10:00,536
in its own virtual envelope,
flips the two IP addresses,

241
00:10:00,536 --> 00:10:02,276
the sender and the
recipient, sends it back

242
00:10:02,496 --> 00:10:03,706
to the internet via
these routers.
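The envelope metaphor maps naturally onto a small data structure. In this sketch the class name `Packet`, its fields, and the laptop's address 192.0.2.7 are assumptions for illustration; 1.2.3.4 is the lecture's placeholder for Google. Real packets carry many more fields.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Packet:
    source: str        # the return address on the envelope
    destination: str   # the "to" address on the envelope
    payload: bytes     # the message inside

    def reply(self, payload: bytes) -> "Packet":
        """Build the response envelope by flipping the two addresses."""
        return Packet(source=self.destination,
                      destination=self.source,
                      payload=payload)


request = Packet("192.0.2.7", "1.2.3.4", b"GET / HTTP/1.1\r\n\r\n")
response = request.reply(b"<html>...</html>")
print(response.source, "->", response.destination)  # 1.2.3.4 -> 192.0.2.7
```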

243
00:10:04,136 --> 00:10:06,206
It arrives on my computer.

244
00:10:06,206 --> 00:10:08,716
My computer, unbeknownst
to me, opens this envelope,

245
00:10:08,876 --> 00:10:10,946
sees a whole bunch of
language called HTML,

246
00:10:11,176 --> 00:10:15,166
renders that HTML top to bottom
and I see the search page

247
00:10:15,226 --> 00:10:16,966
for Google's main site.

248
00:10:16,966 --> 00:10:17,046
Yeah?

249
00:10:17,466 --> 00:10:22,356
>> What is the function
of the slash?

250
00:10:22,426 --> 00:10:23,766
>> What is the function
of the slash?

251
00:10:24,106 --> 00:10:27,516
So whenever you type
in a URL, you--

252
00:10:27,516 --> 00:10:29,506
there are several
different components to it.

253
00:10:29,506 --> 00:10:34,676
Http typically followed by ://
followed by something like this,

254
00:10:35,146 --> 00:10:37,136
and so this is let's say
a representative URL.

255
00:10:37,136 --> 00:10:39,556
But we can actually tease this
apart into a few components.

256
00:10:39,966 --> 00:10:42,776
This is the protocol or
scheme at the beginning.

257
00:10:43,036 --> 00:10:47,366
Even though in a browser we
almost always use http://.

258
00:10:47,366 --> 00:10:48,596
Have folks seen others?

259
00:10:49,626 --> 00:10:49,746
Yeah?

260
00:10:50,486 --> 00:10:50,666
>> HTTPS.

261
00:10:50,896 --> 00:10:56,266
>> HTTPS similar but different
in that it uses cryptography,

262
00:10:56,266 --> 00:10:59,006
a topic we'll come back to.

263
00:10:59,056 --> 00:10:59,256
Yeah?

264
00:10:59,256 --> 00:11:00,606
>> Ftp://.

265
00:11:00,606 --> 00:11:03,586
>> Ftp://, sftp://, webcal://,

266
00:11:03,586 --> 00:11:05,716
some of these are more
standardized than others

267
00:11:06,006 --> 00:11:08,476
but the scheme is typically
an indicator to some piece

268
00:11:08,476 --> 00:11:12,476
of software how it should view
the contents at that address.

269
00:11:12,946 --> 00:11:14,576
So what comes after the ://?

270
00:11:14,926 --> 00:11:17,166
You typically have
something called a host name

271
00:11:17,336 --> 00:11:20,106
or subdomain name
followed by the domain name

272
00:11:20,106 --> 00:11:21,586
which in this case
is google.com,

273
00:11:21,916 --> 00:11:25,326
or followed more precisely
by a domain name with a TLD,

274
00:11:25,466 --> 00:11:31,906
top-level domain, a .com, .edu,
.gov, .uk would be the TLD.

275
00:11:31,906 --> 00:11:34,346
And then you have
what we call a path.

276
00:11:34,476 --> 00:11:37,016
And a path specifies
exactly what file

277
00:11:37,016 --> 00:11:38,476
or folder you want to access.
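The same teasing apart can be done with Python's standard `urllib.parse` module, which calls the first component the scheme. The URL below is the one from the lecture; splitting off the TLD with `rsplit` is a simplification added here and is only an approximation for multi-part suffixes.

```python
from urllib.parse import urlsplit

parts = urlsplit("http://www.google.com/")

print(parts.scheme)    # http
print(parts.hostname)  # www.google.com
print(parts.path)      # /

# Everything after the last dot is the top-level domain.
tld = parts.hostname.rsplit(".", 1)[-1]
print(tld)             # com
```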

278
00:11:38,766 --> 00:11:42,686
So a single slash means get
me the root of my hard drive

279
00:11:42,966 --> 00:11:44,746
and if you come from
the Windows world,

280
00:11:44,746 --> 00:11:47,086
this is essentially
equivalent to c:

281
00:11:47,396 --> 00:11:49,256
or on a Mac it's
equivalent to that

282
00:11:49,256 --> 00:11:51,106
or on Linux computer
it's equivalent to that.

283
00:11:51,576 --> 00:11:53,666
So that is truly the
root of your hard drive,

284
00:11:53,966 --> 00:11:57,376
the folder in which everything
else on your hard drive lives.

285
00:11:58,256 --> 00:12:00,806
Now, it turns out you--
in a browser these days,

286
00:12:00,806 --> 00:12:02,796
you don't have to
type most of that.

287
00:12:02,796 --> 00:12:05,276
You can omit the HTTP, you
can typically omit the www,

288
00:12:05,276 --> 00:12:08,616
you can omit the slash
and things just work.

289
00:12:08,906 --> 00:12:09,626
Why is that?

290
00:12:09,626 --> 00:12:10,536
Well, for the most part,

291
00:12:10,536 --> 00:12:12,956
it's because browsers have just
gotten a lot more user friendly,

292
00:12:13,266 --> 00:12:13,436
right.

293
00:12:13,436 --> 00:12:16,796
There was a time, a few years
ago where advertisements

294
00:12:16,946 --> 00:12:20,436
in print and on TV would
actually have http://

295
00:12:20,436 --> 00:12:22,866
but then the world kind
of realized that you know,

296
00:12:22,866 --> 00:12:26,226
anytime you see www something,
it's probably a website

297
00:12:26,276 --> 00:12:29,346
so we started omitting http://.

298
00:12:29,596 --> 00:12:31,726
Now the world has gotten
acclimated to any mention

299
00:12:31,726 --> 00:12:36,236
of .com or .gov so we don't even
really need the www anymore.

300
00:12:36,236 --> 00:12:39,056
And so, whether or not www works

301
00:12:39,056 --> 00:12:42,006
or doesn't work is actually
completely configurable

302
00:12:42,006 --> 00:12:43,966
by the system administrators
of a website.

303
00:12:44,226 --> 00:12:48,946
And in fact, I don't have a sort
of a soapbox to hop on right now

304
00:12:48,946 --> 00:12:50,956
but invariably during
a semester,

305
00:12:50,956 --> 00:12:54,136
I'll come across some
website for which foo.com

306
00:12:54,136 --> 00:12:57,076
or whatever their domain
is .com just doesn't work,

307
00:12:57,076 --> 00:13:00,946
you have to type in
www.something.com

308
00:13:01,166 --> 00:13:03,646
and that's just a foolish
technical design decision

309
00:13:03,646 --> 00:13:04,176
on their part.

310
00:13:04,176 --> 00:13:06,726
We'll talk today about how
you can configure things

311
00:13:06,726 --> 00:13:07,576
to just work.

312
00:13:07,576 --> 00:13:10,896
And it involves a bit of DNS, a
bit of web server configuration

313
00:13:11,216 --> 00:13:14,096
but typically, you
don't see that dead end

314
00:13:14,346 --> 00:13:17,246
because browsers these
days, if you type in foo.com

315
00:13:17,246 --> 00:13:20,586
and hit enter and there is no
foo.com IP address out there,

316
00:13:20,796 --> 00:13:24,986
the browser will presumptuously
or helpfully prepend www

317
00:13:24,986 --> 00:13:28,906
to the start of the address
and then retry that one.

318
00:13:29,106 --> 00:13:30,716
Some browsers, if
you just type foo,

319
00:13:30,716 --> 00:13:33,006
will automatically
try .com, .net,

320
00:13:33,006 --> 00:13:34,586
.gov some of the
most popular ones.
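The guessing described above could be sketched as follows. This is a hypothetical heuristic written for illustration, not the actual algorithm of any browser; the function name `candidate_hosts` and the order of the guesses are assumptions.

```python
POPULAR_TLDS = (".com", ".net", ".gov")


def candidate_hosts(typed: str) -> list[str]:
    """Host names a forgiving browser might try, in order."""
    if "." not in typed:
        # A bare word such as "foo": guess some popular top-level domains.
        return [typed + tld for tld in POPULAR_TLDS]
    if typed.startswith("www."):
        return [typed]
    # A bare domain such as "foo.com": fall back to the www host name.
    return [typed, "www." + typed]


print(candidate_hosts("foo.com"))  # ['foo.com', 'www.foo.com']
print(candidate_hosts("foo"))      # ['foo.com', 'foo.net', 'foo.gov']
```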

321
00:13:34,586 --> 00:13:36,926
So in short, a lot of
the technical processes

322
00:13:36,926 --> 00:13:39,516
that are happening are
being sort of hidden now

323
00:13:39,516 --> 00:13:42,816
by browser user friendliness
for better or for worse.

324
00:13:44,046 --> 00:13:46,836
So, the story began with
hitting enter, the story ended

325
00:13:46,836 --> 00:13:48,726
with your seeing the
homepage of Google.

326
00:13:49,196 --> 00:13:51,676
Any questions on the
various steps in between,

327
00:13:51,896 --> 00:13:53,336
whether high level or low level?

328
00:13:53,866 --> 00:13:59,566
Right. So that's the story told
from the perspective of a user.

329
00:13:59,916 --> 00:14:03,686
Why don't we tell the story from
the perspective now of someone

330
00:14:03,686 --> 00:14:07,006
who owns a website or
wants to operate a website?

331
00:14:07,006 --> 00:14:09,676
So suppose one of your goals
in this class or some other is

332
00:14:09,816 --> 00:14:12,376
to actually have your
own presence on the web.

333
00:14:12,676 --> 00:14:14,696
When you actually buy
your own domain name

334
00:14:14,696 --> 00:14:16,716
and have your own business
or personal home page

335
00:14:16,716 --> 00:14:19,856
or whatever the case may be,
how do you go about doing that?

336
00:14:20,046 --> 00:14:21,906
You need more than just
a laptop and a browser,

337
00:14:21,906 --> 00:14:23,956
now you need a server
on the internet

338
00:14:23,956 --> 00:14:26,126
because even though every
computer on the internet,

339
00:14:26,126 --> 00:14:28,796
your laptop included,
has an IP address,

340
00:14:29,146 --> 00:14:31,336
it's not necessarily
publicly accessible.

341
00:14:31,336 --> 00:14:34,066
Because even that statement's
a bit of an oversimplification.

342
00:14:34,556 --> 00:14:37,686
You do not necessarily
have a public IP address.

343
00:14:37,826 --> 00:14:40,626
In fact if you go home and you
have internet access at home,

344
00:14:40,626 --> 00:14:43,136
especially wireless, you
probably have a home router

345
00:14:43,136 --> 00:14:46,806
like an Apple AirPort Extreme
or you have a Linksys router

346
00:14:46,806 --> 00:14:48,276
or some device with antennas

347
00:14:48,316 --> 00:14:50,406
that gives you wireless
internet access.

348
00:14:50,786 --> 00:14:53,456
But Comcast or Verizon or
whoever you're paying each month

349
00:14:53,496 --> 00:14:56,476
to give you internet access into
the house via your cable modem

350
00:14:56,476 --> 00:14:59,256
or DSL modem which in
turn is probably connected

351
00:14:59,256 --> 00:15:00,436
to that router.

352
00:15:00,646 --> 00:15:02,136
If it's not one and
the same device,

353
00:15:02,136 --> 00:15:03,786
which some of the
ISPs provide these all

354
00:15:03,786 --> 00:15:07,576
in one devices these days, odds
are you have one IP address.

355
00:15:07,896 --> 00:15:11,826
And if you have three
brothers and sisters or parents

356
00:15:11,826 --> 00:15:13,456
or grandkids in the house,

357
00:15:13,796 --> 00:15:17,076
all of you are sharing
that one IP address.

358
00:15:17,076 --> 00:15:18,616
And yet the individual computers

359
00:15:18,616 --> 00:15:20,376
in the home still
need an IP address

360
00:15:20,706 --> 00:15:22,706
so what actually is the
case is that when you're

361
00:15:22,706 --> 00:15:26,076
in a home network, you have a--

362
00:15:26,076 --> 00:15:28,216
what's called generally
a private IP address.

363
00:15:28,216 --> 00:15:29,316
Something of the form--

364
00:15:29,316 --> 00:15:31,866
anyone what a popular
internal IP address is?

365
00:15:32,081 --> 00:15:34,081
[ Inaudible Remark ]

366
00:15:34,146 --> 00:15:34,816
Yeah. Exactly.

367
00:15:35,716 --> 00:15:37,506
Anything in fact starting

368
00:15:37,506 --> 00:15:42,626
with 192.168 dot something
dot something is a private

369
00:15:42,626 --> 00:15:43,316
IP address.

370
00:15:43,316 --> 00:15:46,456
So the folks who invented the
internet along the way decided,

371
00:15:46,456 --> 00:15:46,846
"You know what?

372
00:15:46,846 --> 00:15:48,676
We should probably
have some IP addresses

373
00:15:48,716 --> 00:15:51,686
that should never be given
out so that within a company

374
00:15:51,686 --> 00:15:53,566
or a home or a little
test network,

375
00:15:53,746 --> 00:15:56,496
you can have IP addresses that
are guaranteed not to exist

376
00:15:56,496 --> 00:15:57,406
on the public internet."

377
00:15:57,766 --> 00:16:02,106
So what home routers typically
use is 192.168.0 or .1

378
00:16:02,366 --> 00:16:05,746
and then the last number can be
again, between zero and 255.

379
00:16:05,806 --> 00:16:08,236
But some exceptions, it
really-- it can't be zero,

380
00:16:08,446 --> 00:16:11,156
can't usually be 255 so
there are some constraints

381
00:16:11,156 --> 00:16:14,736
but it gives you roughly 250
or so possible IP addresses.

382
00:16:14,736 --> 00:16:18,846
If you don't like that,
there's 172.16 dot something

383
00:16:18,846 --> 00:16:19,506
dot something.

384
00:16:19,506 --> 00:16:21,576
There's a few more
constraints on this one,

385
00:16:21,796 --> 00:16:24,916
but then if you really need a
lot of internal IP addresses,

386
00:16:25,236 --> 00:16:28,196
you can have a what's called
a class A private network,

387
00:16:28,506 --> 00:16:30,946
10 dot anything is
a private address.

388
00:16:31,206 --> 00:16:34,096
And this actually gives you
millions of IP addresses

389
00:16:34,096 --> 00:16:36,916
for your home or your
business or your data center.

390
00:16:36,916 --> 00:16:39,506
But in short, any IP
addresses beginning with these

391
00:16:39,506 --> 00:16:41,696
and a few other prefixes
are considered private

392
00:16:41,986 --> 00:16:44,706
but the problem then is that
even if after this class,

393
00:16:44,706 --> 00:16:48,536
you know HTML and CSS all the
better, you know PHP and SQL

394
00:16:48,536 --> 00:16:52,326
and JavaScript and you create
a website and you run it

395
00:16:52,326 --> 00:16:54,676
on your laptop using software
we'll introduce you to,

396
00:16:54,676 --> 00:16:57,926
a web server called Apache,
no one in the world is going

397
00:16:57,926 --> 00:16:59,746
to be able to visit your website

398
00:16:59,856 --> 00:17:02,686
because your address
probably starts with one

399
00:17:02,686 --> 00:17:07,046
of these prefixes and your
home router or cable modem

400
00:17:07,046 --> 00:17:10,586
or DSL modem is not going
to let outside random people

401
00:17:10,586 --> 00:17:15,416
into your home network to access
this IP address because frankly,

402
00:17:15,416 --> 00:17:17,226
there's tens of thousands
of people who probably have

403
00:17:17,226 --> 00:17:18,896
that exact same private
IP address

404
00:17:18,896 --> 00:17:20,876
so it's just not
uniquely identifiable.

405
00:17:21,096 --> 00:17:22,836
And because your home router

406
00:17:23,046 --> 00:17:25,326
and your cable modem
is sometimes a firewall

407
00:17:25,326 --> 00:17:27,406
unto itself, this traffic is
not going to get into your home.

408
00:17:27,736 --> 00:17:30,086
So in short, that
won't work but you have

409
00:17:30,086 --> 00:17:32,116
at least two options,
two alternatives.

410
00:17:32,116 --> 00:17:34,796
How can you get your
websites out on the internet?

411
00:17:35,096 --> 00:17:37,766
>> Well, if you're still trying
to leave it on your own network

412
00:17:37,766 --> 00:17:42,006
if you port forward to
your own private IP address

413
00:17:42,126 --> 00:17:43,346
from your public IP address.

414
00:17:43,476 --> 00:17:43,726
>> You can.

415
00:17:43,986 --> 00:17:44,576
Port forwarding.

416
00:17:44,576 --> 00:17:46,616
So let's go there,
for those unfamiliar.

417
00:17:46,616 --> 00:17:49,096
When you use a protocol
like HTTP,

418
00:17:49,096 --> 00:17:53,906
you're actually using other
protocols behind the scenes

419
00:17:54,046 --> 00:17:58,526
and in fact you've probably at
least heard the buzzword TCP/IP,

420
00:17:58,526 --> 00:18:01,036
Transmission Control
Protocol/Internet Protocol.

421
00:18:01,156 --> 00:18:02,266
It's actually two protocols.

422
00:18:02,266 --> 00:18:04,786
Two different standards
or languages, so to speak,

423
00:18:05,046 --> 00:18:08,456
that govern how data can be
transmitted on the internet.

424
00:18:08,456 --> 00:18:10,036
And this is a bit of
an oversimplification,

425
00:18:10,036 --> 00:18:13,636
but for today's purposes assume
that IP, the internet protocol,

426
00:18:13,906 --> 00:18:16,136
is just a set of
conventions that humans came

427
00:18:16,136 --> 00:18:20,146
up with years ago that govern
how you associate numeric

428
00:18:20,146 --> 00:18:21,646
addresses with computers.

429
00:18:21,786 --> 00:18:23,966
So IP address derives
from this protocol.

430
00:18:24,136 --> 00:18:27,326
So IP is just the standard for
assigning computers addresses.

431
00:18:27,696 --> 00:18:31,076
However, just assigning someone
an address doesn't mean you can

432
00:18:31,076 --> 00:18:32,756
get data to that address.

433
00:18:32,756 --> 00:18:35,976
For that you need another
standard, another protocol

434
00:18:35,976 --> 00:18:38,916
and that's typically TCP,
Transmission Control Protocol.

435
00:18:39,166 --> 00:18:44,136
So TCP is the standard that web
browsers and web servers speak

436
00:18:44,316 --> 00:18:46,896
in order to actually
physically move data

437
00:18:46,896 --> 00:18:49,416
or electronically
move data from point A

438
00:18:49,416 --> 00:18:53,726
to point B using the higher
level notion of an IP address

439
00:18:53,836 --> 00:18:56,996
to actually uniquely
identify points A and point B.

440
00:18:57,246 --> 00:19:00,676
So for those who might want to
go further in computer science

441
00:19:00,676 --> 00:19:01,846
and in networking in particular,

442
00:19:02,066 --> 00:19:04,516
there's typically what's
called the TCP/IP stack.

443
00:19:04,676 --> 00:19:06,676
And so there's topics

444
00:19:06,676 --> 00:19:09,176
like there's the
transport layer down here.

445
00:19:09,176 --> 00:19:12,566
There's the IP or
addressing layer here,

446
00:19:12,756 --> 00:19:14,026
there's the application layer.

447
00:19:14,026 --> 00:19:16,096
In short, much of the
internet is the result

448
00:19:16,096 --> 00:19:18,626
of smart people having designed things and then designed things
things and then design things

449
00:19:18,626 --> 00:19:20,106
on top of things on top of things

450
00:19:20,446 --> 00:19:23,596
and so we just typically
oversimplify and say TCP/IP.

451
00:19:24,176 --> 00:19:25,206
So what's the point there?

452
00:19:25,706 --> 00:19:28,526
TCP/IP allows not
just the web to work

453
00:19:28,566 --> 00:19:30,886
but all sorts of applications.

454
00:19:30,886 --> 00:19:33,836
There's the web, there's email,
there's instant messaging,

455
00:19:34,086 --> 00:19:37,226
there's-- I mean what else--
there's things like Spotify,

456
00:19:37,226 --> 00:19:39,756
there's dedicated applications
that are using the internet

457
00:19:40,056 --> 00:19:41,946
but aren't necessarily
inside of a browser.

458
00:19:42,206 --> 00:19:45,636
So a server can actually
do multiple things.

459
00:19:45,636 --> 00:19:47,876
It can receive email
like Gmail can.

460
00:19:47,876 --> 00:19:50,536
It can be a website
and get HTTP traffic.

461
00:19:50,866 --> 00:19:54,056
So a server, because it can do
multiple things, somehow needs

462
00:19:54,056 --> 00:19:57,596
to be able to uniquely
identify the various things

463
00:19:57,596 --> 00:19:58,366
that it can do.

464
00:19:58,586 --> 00:20:00,976
And so the world introduced
this notion of port numbers.

465
00:20:01,046 --> 00:20:06,036
And typically for a web
server, or rather for HTTP,

466
00:20:06,036 --> 00:20:08,336
it uses this protocol TCP

467
00:20:08,336 --> 00:20:12,036
and the world decided some
years ago the number 80 will

468
00:20:12,036 --> 00:20:15,426
arbitrarily but consistently
identify this service.

469
00:20:15,766 --> 00:20:18,516
So if you have a server
and you have a website.

470
00:20:18,786 --> 00:20:20,746
And a website uses,
as you probably know,

471
00:20:20,746 --> 00:20:22,996
HTTP but we'll look at
what that means in a bit.

472
00:20:23,706 --> 00:20:26,406
It is running, so to
speak, on port 80.

473
00:20:26,406 --> 00:20:28,366
It is listening, so
to speak, on port 80.
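
[Editor's sketch, not part of the lecture: what "listening on a port" looks like with Python's standard socket module. To stay runnable without administrator rights, this assumes the loopback address 127.0.0.1 and lets the operating system pick a free port instead of using port 80; the reply text is made up for illustration.]

```python
import socket
import threading

def serve_once(server_sock: socket.socket, reply: bytes) -> None:
    """Accept a single connection, send one reply, and close it."""
    conn, _addr = server_sock.accept()
    with conn:
        conn.sendall(reply)

def demo():
    # A TCP socket bound to an address and port is what "listens".
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.bind(("127.0.0.1", 0))  # port 0 asks the OS for any free port
    server.listen(1)
    port = server.getsockname()[1]

    worker = threading.Thread(
        target=serve_once, args=(server, b"hello from the web service")
    )
    worker.start()

    # The client must name both the address and the port to reach the service.
    with socket.create_connection(("127.0.0.1", port)) as client:
        data = client.recv(1024)

    worker.join()
    server.close()
    return port, data

port, data = demo()
print(port, data)
```

[A second service on the same machine would bind a different port, which is how one box can tell web traffic from email traffic.]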

474
00:20:28,586 --> 00:20:30,236
And the motivation for that is

475
00:20:30,236 --> 00:20:33,956
because you might also
have an email server

476
00:20:33,956 --> 00:20:35,386
on the same physical box, right?

477
00:20:35,386 --> 00:20:37,476
Gmail, it's kind of
an oversimplification

478
00:20:37,476 --> 00:20:39,936
but they are both a website
and an email service.

479
00:20:39,936 --> 00:20:42,656
And if you want to be able
to send email to Gmail,

480
00:20:42,906 --> 00:20:46,386
you can also use TCP but
you have to use port 25.

481
00:20:46,816 --> 00:20:50,526
In other words, if you go
to Gmail.com with a browser,

482
00:20:50,686 --> 00:20:51,996
you obviously want
a webpage back.

483
00:20:52,406 --> 00:20:54,716
So even though you, the
human haven't typed 80,

484
00:20:54,716 --> 00:20:56,226
it's automatically
inserted for you

485
00:20:56,226 --> 00:20:57,926
by your browser behind
the scenes.

486
00:20:58,276 --> 00:21:01,926
But if you sent an email from
Eudora or Apple Mail or Outlook,

487
00:21:01,926 --> 00:21:04,196
or whatever you're using,
you again probably don't have

488
00:21:04,196 --> 00:21:06,996
to care about this detail
but that program is going

489
00:21:06,996 --> 00:21:11,536
to send data still to Gmail.com
but specifically to port 25.

490
00:21:11,946 --> 00:21:14,246
So when a computer is on
the internet, a server

491
00:21:14,246 --> 00:21:17,666
and it's listening for traffic,
all of that traffic comes

492
00:21:17,666 --> 00:21:20,786
in on a specific port,
a specific like pathway

493
00:21:20,786 --> 00:21:22,486
into the server so that it knows

494
00:21:22,486 --> 00:21:24,176
if it's a webpage
or an email, right?

495
00:21:24,176 --> 00:21:25,116
Because especially email.

496
00:21:25,116 --> 00:21:28,206
Emails can contain HTML
now so you need some way

497
00:21:28,206 --> 00:21:30,246
of distinguishing the
two fundamentally.

498
00:21:30,696 --> 00:21:33,176
So when you proposed port
forwarding, what does this mean?

499
00:21:33,366 --> 00:21:36,696
Well, if your home network
has a public IP address

500
00:21:36,696 --> 00:21:38,826
and you usually, again,
get one from your ISP,

501
00:21:38,826 --> 00:21:44,256
and that is some address of
the form, w.x.y.z, and you,

502
00:21:44,256 --> 00:21:46,966
your individual laptop on
which you've created your final

503
00:21:46,966 --> 00:21:48,856
project that you want to
make publicly available,

504
00:21:49,066 --> 00:21:50,506
is at one of these
IP addresses,

505
00:21:50,506 --> 00:21:51,756
doesn't really matter
what it is.

506
00:21:52,076 --> 00:21:54,366
What you can do is
configure your home router,

507
00:21:54,486 --> 00:21:57,746
a.k.a. firewall, a.k.a. cable
modem, it depends on what make

508
00:21:57,746 --> 00:22:00,966
and model you have but that
device, you can configure it

509
00:22:00,966 --> 00:22:04,276
to say, any internet traffic
that comes from the internet

510
00:22:04,476 --> 00:22:07,696
to my home on my public
IP address destined

511
00:22:07,696 --> 00:22:10,746
for port 80 should
be "port forwarded"

512
00:22:10,916 --> 00:22:16,286
to IP address 192.168
dot something port 80.

513
00:22:16,446 --> 00:22:18,076
In other words, you
can tell this machine

514
00:22:18,076 --> 00:22:19,896
to take incoming
data on that port

515
00:22:19,896 --> 00:22:22,686
and then route it very
specifically to this computer,

516
00:22:22,906 --> 00:22:25,266
yours, so that it just works.
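
[Editor's sketch, not part of the lecture: port forwarding modeled as a lookup table on the router. The private address 192.168.1.10 is a hypothetical laptop; real routers are configured through their own admin pages, not through code like this.]

```python
from typing import Optional, Tuple

# Hypothetical rule: traffic arriving on the public address at port 80
# is handed to the laptop at 192.168.1.10, port 80.
FORWARDING_TABLE = {
    80: ("192.168.1.10", 80),
}

def forward(dest_port: int) -> Optional[Tuple[str, int]]:
    """Return the private (address, port) for an incoming port, or None if
    there is no rule, in which case the router drops the traffic."""
    return FORWARDING_TABLE.get(dest_port)

print(forward(80))  # matches the rule
print(forward(25))  # no rule for this port
```

[Because the table is keyed by port number, each public port can point at only one machine inside the home.]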

517
00:22:26,106 --> 00:22:29,226
Now there is one gotcha here,
especially if you have siblings

518
00:22:29,226 --> 00:22:31,116
for instance or other
technically minded family

519
00:22:31,116 --> 00:22:31,896
members or roommates.

520
00:22:32,286 --> 00:22:35,546
If you're doing port
forwarding in this way,

521
00:22:35,726 --> 00:22:39,816
only one of you can operate
a web server behind your

522
00:22:39,816 --> 00:22:40,476
cable modem.

523
00:22:40,476 --> 00:22:42,516
Because you only
have one IP address

524
00:22:42,516 --> 00:22:44,376
to uniquely identify
your website

525
00:22:44,606 --> 00:22:46,796
and if you've already
claimed 80 as your own

526
00:22:46,796 --> 00:22:49,016
and that's the default for
the world's browsers to use,

527
00:22:49,206 --> 00:22:51,606
pretty much only your web
server can be accessed.

528
00:22:51,806 --> 00:22:53,466
Now there is a workaround here.

529
00:22:53,466 --> 00:22:55,416
If your roommate is really
ticked off at you, you can say,

530
00:22:55,416 --> 00:22:57,796
"Fine, fine, fine, I
will give you port 81."

531
00:22:57,796 --> 00:23:01,126
But what does that mean, that
means the entire world has

532
00:23:01,396 --> 00:23:05,816
to type out a URL like let's
say your address was indeed,

533
00:23:05,816 --> 00:23:10,346
w.x.y.z this would be
your address, your URL.

534
00:23:10,346 --> 00:23:12,506
Your roommate's, unfortunately,

535
00:23:12,506 --> 00:23:13,796
would be this crazy
looking thing.

536
00:23:14,356 --> 00:23:17,626
Right? Or any number
other-- any number really.

537
00:23:17,816 --> 00:23:19,356
Now there are some
restrictions on the numbers.

538
00:23:19,356 --> 00:23:22,206
You just probably can't use
81, but the point is the same.

539
00:23:22,506 --> 00:23:25,056
This is not standard and you
probably don't want your users

540
00:23:25,056 --> 00:23:27,136
having to remember
such an esoteric detail

541
00:23:27,136 --> 00:23:28,676
as an arbitrary number.

542
00:23:28,906 --> 00:23:34,316
However, if on the internet you
visit any website with colon 80,

543
00:23:34,556 --> 00:23:36,466
odds are you will
get to the website

544
00:23:36,466 --> 00:23:37,676
with which you're familiar.

545
00:23:37,676 --> 00:23:39,036
It's just the browser is again

546
00:23:39,036 --> 00:23:42,576
for user convenience inserting
the port number automatically

547
00:23:42,636 --> 00:23:42,936
for you.
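
[Editor's sketch, not part of the lecture: how a client might fill in the port when the URL omits it, using Python's standard urllib.parse. The host example.com and the helper name effective_port are assumptions for illustration; 80 for http and 443 for https are the standard defaults.]

```python
from urllib.parse import urlsplit

# Standard default ports for the two web schemes.
DEFAULT_PORTS = {"http": 80, "https": 443}

def effective_port(url: str) -> int:
    """Return the port a client would connect to for this URL."""
    parts = urlsplit(url)
    if parts.port is not None:
        return parts.port  # the URL names a port explicitly
    return DEFAULT_PORTS[parts.scheme]  # otherwise use the scheme's default

print(effective_port("http://example.com/"))     # port omitted
print(effective_port("http://example.com:80/"))  # same destination, spelled out
print(effective_port("http://example.com:81/"))  # the roommate's URL
```

[Only the nonstandard port has to be typed by the user, which is the inconvenience described above.]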

548
00:23:43,456 --> 00:23:47,796
And little trivia for HTTPS,
the secure version of HTTP,

549
00:23:47,796 --> 00:23:48,836
what port number does that use?

550
00:23:48,836 --> 00:23:49,626
>> 443.

551
00:23:50,046 --> 00:23:52,866
>> 443. And you sometimes
do see that in URLs.

552
00:23:52,866 --> 00:23:55,386
You also see some other
ports commonly like 8080.

553
00:23:55,386 --> 00:23:58,766
8080 is just kind of an
arbitrary popular port

554
00:23:58,766 --> 00:24:00,996
that some companies have used
to run certain services.

555
00:24:01,276 --> 00:24:03,486
But in short, using anything
nonstandard these days

556
00:24:03,486 --> 00:24:05,666
especially for commercial
production websites

557
00:24:05,666 --> 00:24:07,176
where you're trying to
make money or trying

558
00:24:07,176 --> 00:24:11,856
to stay online 100% of the time,
using nonstandard ports is bad.

559
00:24:11,986 --> 00:24:13,166
Because there are
certain companies,

560
00:24:13,166 --> 00:24:13,906
there are certain campuses

561
00:24:13,946 --> 00:24:17,916
that will pretty much block
any port besides 80 and 443.

562
00:24:18,306 --> 00:24:19,606
But thankfully there's
a workaround.

563
00:24:19,606 --> 00:24:21,496
Even if you want to
run some random server

564
00:24:21,496 --> 00:24:23,176
like a BitTorrent server
or something like that,

565
00:24:23,666 --> 00:24:27,436
all you have to do is change
the port number to be 80 or 443.

566
00:24:27,696 --> 00:24:29,536
So the reality is
with firewalling,

567
00:24:29,536 --> 00:24:31,606
and we'll have this conversation
toward the end of the semester

568
00:24:31,606 --> 00:24:33,266
when we talk about
security more generally,

569
00:24:33,266 --> 00:24:35,766
and a lot of security
mechanisms are kind of a joke

570
00:24:35,766 --> 00:24:38,186
because all you need is a
modicum of savvy or, you know,

571
00:24:38,186 --> 00:24:40,536
having listened to the
past 30 seconds of words

572
00:24:40,536 --> 00:24:42,806
that I just said and you
can circumvent these kinds

573
00:24:42,806 --> 00:24:43,906
of restrictions.

574
00:24:43,906 --> 00:24:46,236
Hotels do this a lot,
Starbucks does this a lot.

575
00:24:46,496 --> 00:24:49,486
But port numbers are really
just this very basic mechanism

576
00:24:49,586 --> 00:24:51,106
and the world has
adopted some standards.

577
00:24:51,576 --> 00:24:52,546
All right.

578
00:24:53,666 --> 00:24:55,546
So, perfect, we have a solution.

579
00:24:55,546 --> 00:24:58,146
All you have to do is
somehow figure out how

580
00:24:58,146 --> 00:25:02,746
to download the manual for your
Linksys router or Apple AirPort

581
00:25:02,786 --> 00:25:04,576
and you can configure all
this port forwarding stuff

582
00:25:04,576 --> 00:25:05,966
and run a website
from your home.

583
00:25:06,566 --> 00:25:07,736
So not quite, right?

584
00:25:07,736 --> 00:25:10,206
Because if you actually have
a popular website, Verizon

585
00:25:10,206 --> 00:25:11,816
and Comcast might
very well notice

586
00:25:11,816 --> 00:25:13,106
and just shut you off entirely

587
00:25:13,356 --> 00:25:16,276
because that huge disclosure
agreement you probably clicked

588
00:25:16,276 --> 00:25:17,536
through and never read
when you've signed

589
00:25:17,536 --> 00:25:20,576
up for internet service probably
said you may not run a website

590
00:25:20,576 --> 00:25:21,816
on your home computer.

591
00:25:22,296 --> 00:25:24,556
So plus this was a pain
in the neck to do anyway.

592
00:25:24,556 --> 00:25:27,266
So we might-- plus I
unplug my laptop sometimes

593
00:25:27,266 --> 00:25:29,996
and so my website is going to go
down anytime I go to-- go out.

594
00:25:30,436 --> 00:25:32,716
So not the best solution
even if you have a desktop.

595
00:25:32,716 --> 00:25:35,056
So let's at least try to push
a little harder and assume

596
00:25:35,056 --> 00:25:37,146
that we need to outsource this
problem or we at least need

597
00:25:37,146 --> 00:25:40,406
to put a computer on
the internet itself

598
00:25:40,466 --> 00:25:42,776
in a data center, on a campus
where it can stay plugged

599
00:25:42,776 --> 00:25:44,576
in perpetually under
your desk at work,

600
00:25:44,576 --> 00:25:46,176
if the sys admins allow it.

601
00:25:46,176 --> 00:25:51,136
And moreover, I don't want
my website to live at w.x.y.z

602
00:25:51,136 --> 00:25:52,876
or any number for that matter.

603
00:25:53,096 --> 00:25:57,136
I want it to live at david.com
or some URL that is sort

604
00:25:57,136 --> 00:25:59,376
of distinctly my
brand or my name.

605
00:25:59,606 --> 00:26:01,236
So, that begs the
question, how do you go

606
00:26:01,236 --> 00:26:04,966
about getting your
own domain name.

607
00:26:05,556 --> 00:26:06,776
Has anyone done this before?

608
00:26:07,896 --> 00:26:08,946
Yeah, how do you do it?

609
00:26:09,676 --> 00:26:11,336
>> I purchase them.

610
00:26:11,556 --> 00:26:12,556
>> OK. Where do you
purchase them?

611
00:26:12,556 --> 00:26:14,496
>> I got mine in Namecheap.

612
00:26:14,496 --> 00:26:17,786
>> OK so Namecheap.com
is a very popular place,

613
00:26:17,986 --> 00:26:18,786
fairly inexpensive.

614
00:26:18,786 --> 00:26:21,056
GoDaddy is another
very popular place.

615
00:26:21,056 --> 00:26:25,636
This one is kind of riddled
with upsell attempts,

616
00:26:25,636 --> 00:26:28,066
trying to get you to buy
everything and the kitchen sink.

617
00:26:28,306 --> 00:26:29,206
But you don't need to do that.

618
00:26:29,206 --> 00:26:33,036
There's all sorts of domain name
registrars out there these days.

619
00:26:33,036 --> 00:26:34,346
Many-- A bunch of years ago,

620
00:26:34,526 --> 00:26:36,246
Network Solutions
was the only one.

621
00:26:36,506 --> 00:26:39,046
But then a market was
created and so there's lots

622
00:26:39,096 --> 00:26:40,456
of places to buy domain names.

623
00:26:40,506 --> 00:26:42,266
For the most part
it doesn't matter

624
00:26:42,266 --> 00:26:44,006
where you buy your
domain name from.

625
00:26:44,006 --> 00:26:45,716
But you do sometimes
get different features.

626
00:26:46,026 --> 00:26:49,516
In particular, you get DNS
features sometimes, more control

627
00:26:49,516 --> 00:26:50,576
over your DNS servers.

628
00:26:50,826 --> 00:26:53,236
They might throw in free
email accounts, free hosting

629
00:26:53,496 --> 00:26:55,416
but for the most part it
doesn't matter a huge amount,

630
00:26:55,546 --> 00:26:57,776
in particular you don't
need to go to someone

631
00:26:57,776 --> 00:27:01,066
like Network Solutions and pay
$30 a year when you could go

632
00:27:01,066 --> 00:27:03,666
to someone like GoDaddy
and pay 9.99 a year

633
00:27:03,666 --> 00:27:06,216
or Namecheap and
pay 4.99 a year.

634
00:27:06,426 --> 00:27:07,806
So in short, just paying more

635
00:27:07,806 --> 00:27:10,826
for a domain name isn't
necessarily giving you anything

636
00:27:10,956 --> 00:27:14,786
more in the way of
functionality.

637
00:27:14,786 --> 00:27:16,936
It depends on what
maybe the add-ons are.

638
00:27:17,486 --> 00:27:19,106
So, how do we go
about doing this?

639
00:27:19,106 --> 00:27:20,936
Well, let's go to
something like GoDaddy.

640
00:27:20,936 --> 00:27:23,116
GoDaddy is kind of a-- Well,
let's actually try Namecheap.

641
00:27:23,306 --> 00:27:25,596
Let's go to Namecheap and
see what they look like.

642
00:27:25,596 --> 00:27:26,306
Namecheap.

643
00:27:27,616 --> 00:27:29,696
Bunch of my friends have
indeed used these websites.

644
00:27:29,746 --> 00:27:31,076
All right.

645
00:27:31,136 --> 00:27:32,806
So let's see, domain
name to search.

646
00:27:32,806 --> 00:27:36,486
I'm going to search for
david.com, probably taken.

647
00:27:36,896 --> 00:27:37,756
Oh, that is a good price.

648
00:27:38,076 --> 00:27:39,486
We're already doing
better than GoDaddy.

649
00:27:39,686 --> 00:27:42,036
All right, so as I
expected, it is taken,

650
00:27:42,036 --> 00:27:44,076
as are almost all
forms of David.

651
00:27:45,606 --> 00:27:48,196
They've suggested I name
myself David John, David Smith,

652
00:27:48,286 --> 00:27:51,326
David Johnson, King David,
David photography dot US.

653
00:27:51,706 --> 00:27:52,946
So, one of the hardest
things frankly

654
00:27:52,946 --> 00:27:54,986
of starting a business these
days is finding an available

655
00:27:54,986 --> 00:27:57,766
domain name, let alone your own
personal vanity domain names

656
00:27:57,766 --> 00:27:58,586
for people's names.

657
00:27:58,956 --> 00:28:01,136
But if we found something
we liked,

658
00:28:01,136 --> 00:28:05,546
maybe I do want DavidTV
dot-- well, that's atrocious,

659
00:28:05,886 --> 00:28:08,316
$6000 for this domain
but if it's not

660
00:28:08,316 --> 00:28:11,406
yet taken it's probably one
of the cheaper ones up above.

661
00:28:11,406 --> 00:28:13,826
So let's assume we found
something we're happy with.

662
00:28:14,066 --> 00:28:16,516
So we add it to our
cart and we check out.

663
00:28:16,816 --> 00:28:19,866
I now own some domain name,
David something dot com.

664
00:28:20,426 --> 00:28:21,886
So what now do I do with it?

665
00:28:22,096 --> 00:28:25,266
How do I associate it with my
web server, and for that matter,

666
00:28:25,266 --> 00:28:26,146
how do we get a web server?

667
00:28:26,146 --> 00:28:27,726
Let's assume I have a web
server and we'll cross

668
00:28:27,726 --> 00:28:29,246
that bridge in a moment.

669
00:28:29,716 --> 00:28:32,076
But I have a domain name,
what do I need to do

670
00:28:32,076 --> 00:28:33,476
with it to start using it?

671
00:28:33,476 --> 00:28:36,446
Well, I need to tell the
world what my IP address is.

672
00:28:36,826 --> 00:28:40,656
So I need to somehow tell
the world that my server,

673
00:28:40,656 --> 00:28:42,166
I don't know who's
going to be hosting it

674
00:28:42,446 --> 00:28:43,826
but I know it will
have an IP address

675
00:28:43,896 --> 00:28:45,176
by nature of how the web works.

676
00:28:45,556 --> 00:28:47,506
So let's assume I know
the IP address is going

677
00:28:47,506 --> 00:28:51,976
to be w.x.y.z. I somehow have
to inform the whole world

678
00:28:52,116 --> 00:28:56,046
that david.com's IP
address is w.x.y.z. So one

679
00:28:56,046 --> 00:28:59,056
of the things I'll have to do
at Namecheap.com or GoDaddy

680
00:28:59,056 --> 00:29:03,886
or networksolutions.com is I
tell the registrar not what my

681
00:29:03,886 --> 00:29:05,866
own computer's IP
address will be

682
00:29:06,156 --> 00:29:07,916
but rather what the IP address

683
00:29:08,616 --> 00:29:12,626
of my domain name's
DNS servers will be.

684
00:29:12,766 --> 00:29:15,376
And the convention is
typically that every domain name

685
00:29:15,376 --> 00:29:18,656
in the world should have two DNS
servers, primary and secondary,

686
00:29:18,946 --> 00:29:20,396
so a main one and a backup one.

687
00:29:20,656 --> 00:29:23,106
They can be one and the same but
the world really pushes people

688
00:29:23,106 --> 00:29:25,706
to having at least two for the
sake of uptime and redundancy.

689
00:29:26,076 --> 00:29:29,476
So I need to know not my own
IP address per se but I need

690
00:29:29,476 --> 00:29:32,996
to know the IP address of one
and then a second DNS server.

691
00:29:33,186 --> 00:29:35,366
Now I don't have my own DNS
servers and I don't want

692
00:29:35,366 --> 00:29:37,226
to go have to configure
two more computers

693
00:29:37,226 --> 00:29:38,446
in addition to my web server.

694
00:29:38,676 --> 00:29:40,746
So this is where web
hosting companies come in.

695
00:29:41,006 --> 00:29:42,746
So in addition to
buying the domain name,

696
00:29:42,746 --> 00:29:45,076
I also want to host
my website somewhere

697
00:29:45,266 --> 00:29:47,936
and it could very well be
the same exact company.

698
00:29:47,936 --> 00:29:50,346
It could be GoDaddy, it
could be Namecheap depending

699
00:29:50,346 --> 00:29:51,876
on the service that
they provide.

700
00:29:52,256 --> 00:29:56,806
But we need to have
a web hosting option.

701
00:29:56,896 --> 00:29:57,976
So what's a web host
going to give us?

702
00:29:58,096 --> 00:30:02,296
A web host is going to give us
a hard drive to put my files on,

703
00:30:02,416 --> 00:30:03,916
you know, maybe not
a hard drive per se,

704
00:30:03,916 --> 00:30:05,436
but some illusion
of storage space.

705
00:30:05,916 --> 00:30:08,366
They are going to have their
own connections to the internet,

706
00:30:08,686 --> 00:30:09,806
this web hosting company.

707
00:30:10,046 --> 00:30:12,066
They are hopefully going to
have a pool of IP addresses

708
00:30:12,066 --> 00:30:13,596
so that I can have
at least one of them.

709
00:30:13,596 --> 00:30:15,986
They're also going
to have some RAM.

710
00:30:16,046 --> 00:30:18,386
They're also going to have
technical support staff.

711
00:30:18,386 --> 00:30:20,186
In short, they're going
to have a server and all

712
00:30:20,186 --> 00:30:22,806
of the things necessary to keep
a server alive on the internet.

713
00:30:23,006 --> 00:30:23,956
And hopefully, they're
also going

714
00:30:23,956 --> 00:30:25,886
to have at least two of what?

715
00:30:27,476 --> 00:30:28,696
DNS servers.

716
00:30:28,926 --> 00:30:31,176
So if I decide to
host my website,

717
00:30:31,176 --> 00:30:32,586
let's say DreamHost.com.

718
00:30:32,586 --> 00:30:35,806
This is a very popular
sort of el cheapo kind

719
00:30:35,806 --> 00:30:40,556
of web hosting company that
I've used myself in the past,

720
00:30:40,556 --> 00:30:43,576
it's like 6.95 or 8.95 a
month, so that's pretty good,

721
00:30:43,576 --> 00:30:44,956
but again, you get
what you pay for.

722
00:30:44,956 --> 00:30:47,506
I wouldn't necessarily
build a big business on it.

723
00:30:47,506 --> 00:30:51,726
So for 8.95 a month, I have
the ability to upload my HTML

724
00:30:51,886 --> 00:30:53,076
and CSS files and soon PHP

725
00:30:53,076 --> 00:30:56,116
and JavaScript files
to their server.

726
00:30:56,416 --> 00:31:00,046
Their server has
nearby two DNS servers,

727
00:31:00,046 --> 00:31:01,766
each of which have
their own IP addresses.

728
00:31:01,766 --> 00:31:05,116
So once I know what
DreamHost's IP addresses are

729
00:31:05,116 --> 00:31:07,976
for its name servers, I
tell Namecheap, or GoDaddy

730
00:31:07,976 --> 00:31:10,406
or wherever I bought my
domain name, and that's it.

731
00:31:10,406 --> 00:31:11,576
The only time I have to talk

732
00:31:11,576 --> 00:31:13,886
to my registrar again
most likely is in a year

733
00:31:13,886 --> 00:31:18,366
when they charge me another
5.99 or $9.99 for my domain name.

734
00:31:18,366 --> 00:31:21,036
Unfortunately, buying, you're
really renting your domain name

735
00:31:21,106 --> 00:31:22,046
from these registrars.

736
00:31:22,876 --> 00:31:25,116
Now, there's a whole bunch
more involved in setting

737
00:31:25,116 --> 00:31:28,526
up of the web server and
getting my files there,

738
00:31:28,766 --> 00:31:32,036
but at least now I've told the
world that if you want to know

739
00:31:32,036 --> 00:31:36,316
where david.com is, ask these
people, these two IP addresses

740
00:31:36,316 --> 00:31:37,646
of the name server, either one.

741
00:31:37,806 --> 00:31:41,336
And those IP-- those DNS
servers should hopefully know.

742
00:31:41,336 --> 00:31:43,906
Why? Because so long as
I keep paying DreamHost

743
00:31:43,906 --> 00:31:47,366
or someone else 8.95 per month,
they will ensure that both

744
00:31:47,366 --> 00:31:51,496
of those DNS servers know what
my own website's IP address is.

745
00:31:51,496 --> 00:31:52,376
And how will they know?

746
00:31:52,576 --> 00:31:55,146
Because what I'm paying
for is some storage space

747
00:31:55,276 --> 00:31:57,886
and some internet connectivity
on one of their servers.

748
00:31:57,886 --> 00:31:59,756
One of their servers
has an IP address,

749
00:32:00,016 --> 00:32:01,626
so they just tell
their DNS servers

750
00:32:01,626 --> 00:32:04,906
that david.com's IP address
is whatever the IP address is

751
00:32:04,906 --> 00:32:07,176
of the server they've told
me to put my content on.
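
[Editor's sketch, not part of the lecture: the two-step lookup just described, modeled with plain dictionaries. The registrar is told only the name servers' addresses, and the name servers hold the website's address. The domain example.com and every IP address here are placeholders from documentation ranges, not real records.]

```python
from typing import Optional

# Step 1: what the registrar is told, i.e. the domain's two name servers.
REGISTRAR = {
    "example.com": ["198.51.100.1", "198.51.100.2"],
}

# Step 2: what each of the host's name servers knows, i.e. the web server's IP.
NAME_SERVERS = {
    "198.51.100.1": {"example.com": "192.0.2.10"},
    "198.51.100.2": {"example.com": "192.0.2.10"},
}

def resolve(domain: str) -> Optional[str]:
    """Ask each name server in turn; either one should know the answer."""
    for ns_ip in REGISTRAR.get(domain, []):
        records = NAME_SERVERS.get(ns_ip)
        if records and domain in records:
            return records[domain]
    return None

print(resolve("example.com"))
```

[If the host later moves the site to another server, only the name servers' records change; nothing needs updating at the registrar.]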

752
00:32:08,306 --> 00:32:10,376
And we'll actually look
in a little more detail

753
00:32:10,376 --> 00:32:13,476
of what's involved in that.

754
00:32:13,476 --> 00:32:16,476
But any questions?

755
00:32:16,556 --> 00:32:22,406
So, in answer to the
somewhat frequent problem

756
00:32:22,406 --> 00:32:26,126
where a website does work
at www.something.com but not

757
00:32:26,126 --> 00:32:28,946
at something.com, how do
you fix something like that?

758
00:32:28,946 --> 00:32:30,886
There's usually two
pieces to the solution.

759
00:32:31,096 --> 00:32:34,286
One, you have to make sure
that there's a DNS record

760
00:32:34,636 --> 00:32:38,536
for something.com, that is
there's an IP address associated

761
00:32:38,536 --> 00:32:40,466
with it in addition to
one being associated

762
00:32:40,466 --> 00:32:44,996
with www.something.com and you
need to configure the web server

763
00:32:45,126 --> 00:32:47,966
to accept requests for
either something.com

764
00:32:47,966 --> 00:32:50,976
or www.something.com.
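
[Editor's sketch, not part of the lecture: the two-part fix above as a simple check. A hostname works only if it has a DNS record and the web server is configured to accept it. The IP address and the function name reachable are hypothetical.]

```python
def reachable(host: str, dns_records: dict, server_names: set) -> bool:
    """A host works only when both pieces of the fix are in place."""
    return host in dns_records and host in server_names

# Broken setup: only the www name was ever configured.
dns_before = {"www.something.com": "192.0.2.10"}
names_before = {"www.something.com"}

# Fixed setup: both names have a DNS record and the server accepts both.
dns_after = {"www.something.com": "192.0.2.10", "something.com": "192.0.2.10"}
names_after = {"www.something.com", "something.com"}

print(reachable("something.com", dns_before, names_before))
print(reachable("something.com", dns_after, names_after))
```

[Doing only one of the two steps leaves the bare domain broken, just in a different way.]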

765
00:32:51,246 --> 00:32:55,856
But really, let's focus on
just this DNS piece for now.

766
00:32:56,436 --> 00:33:00,436
So DNS. Turns out DNS is
relatively straightforward,

767
00:33:00,436 --> 00:33:02,786
and once you start operating
a whole bunch of services

768
00:33:02,786 --> 00:33:04,986
on your own website, maybe
you have an email server,

769
00:33:05,206 --> 00:33:08,236
maybe you want to use hosted
services like Google Calendar,

770
00:33:08,806 --> 00:33:11,456
maybe Google Docs, you
can do things like--

771
00:33:11,906 --> 00:33:14,536
actually for CS-75, for
this course, the TFs

772
00:33:14,536 --> 00:33:19,946
and I use Gmail essentially
to host cs75.net's email.

773
00:33:20,446 --> 00:33:24,166
So that's the website as
we'll-- as I'll soon reveal,

774
00:33:24,166 --> 00:33:25,846
if you haven't pulled
up the website.

775
00:33:25,846 --> 00:33:28,166
And we want to be able to have
an email list so that each

776
00:33:28,166 --> 00:33:30,016
of us can email everyone
else very easily.

777
00:33:30,016 --> 00:33:34,026
So we want email addresses
of the form mailing@cs75.net.

778
00:33:34,536 --> 00:33:35,356
Now, how do we do this?

779
00:33:35,356 --> 00:33:38,086
Well, we could set up a mail
server, we could pay someone

780
00:33:38,086 --> 00:33:40,886
to do this, but an amazing
service out there is Google Apps

781
00:33:40,886 --> 00:33:43,726
with which some of you might
be familiar, and for small fish

782
00:33:43,726 --> 00:33:45,916
like us, where we only
have a few people on staff,

783
00:33:46,426 --> 00:33:49,136
you can actually have hosted
Gmail, hosted Google Calendar,

784
00:33:49,136 --> 00:33:53,526
hosted Google Documents for
I think 20 or fewer people

785
00:33:53,586 --> 00:33:56,986
for free, and what you do is you
configure your own DNS servers

786
00:33:57,256 --> 00:34:01,846
to map something
like mail.cs75.net

787
00:34:02,386 --> 00:34:04,526
to essentially Gmail.com.

788
00:34:04,776 --> 00:34:06,936
So that whenever we send
an email to something

789
00:34:06,936 --> 00:34:10,296
of the form mail.cs75.net,
it figures out via DNS

790
00:34:10,296 --> 00:34:12,076
to actually go to Google.

791
00:34:12,076 --> 00:34:15,016
We could have calendar.cs75.net
and you hit enter,

792
00:34:15,156 --> 00:34:16,746
you actually end up
at Google Calendar,

793
00:34:16,746 --> 00:34:18,616
but our copy of Google Calendar.

794
00:34:18,616 --> 00:34:19,596
And this is all thanks to DNS.

795
00:34:19,596 --> 00:34:21,966
And there's only a few
settings with which you need

796
00:34:21,966 --> 00:34:23,956
to be familiar, and we
already talked about this one.

797
00:34:24,246 --> 00:34:30,096
An NS record is a
record in a DNS server

798
00:34:30,396 --> 00:34:34,346
that tells the world what the
name server is for that domain.

799
00:34:35,176 --> 00:34:37,356
So, what's inside a DNS server?

800
00:34:37,536 --> 00:34:39,746
Frankly, it's a database
and you can think of it

801
00:34:39,746 --> 00:34:42,766
as like a database with
Excel files, so spreadsheets

802
00:34:42,766 --> 00:34:44,136
that just have rows and columns.

803
00:34:44,576 --> 00:34:46,906
And those columns
essentially represent--

804
00:34:47,786 --> 00:34:49,916
well, in each row
rather, you would have

805
00:34:49,916 --> 00:34:52,276
for instance a domain
name and an IP address.

806
00:34:52,276 --> 00:34:54,476
Domain name, IP address,
domain name, IP address.

807
00:34:54,476 --> 00:34:56,086
That's really all that's
underneath the hood

808
00:34:56,086 --> 00:34:58,526
in a DNS server, at least
so far as we're concerned.

809
00:34:58,916 --> 00:35:00,566
But there are different
types of rows.

810
00:35:01,026 --> 00:35:03,876
So one of those rows can
be an official record

811
00:35:04,036 --> 00:35:05,806
that says the name server, NS,

812
00:35:06,406 --> 00:35:09,336
for this domain is whatever
IP address DreamHost gave me

813
00:35:09,776 --> 00:35:10,226
for instance.

814
00:35:11,206 --> 00:35:12,716
Now, what else can I have?

815
00:35:12,926 --> 00:35:14,246
Well, there's an A record.

816
00:35:14,396 --> 00:35:18,226
So an A record, a row of
type A in the spreadsheet

817
00:35:18,226 --> 00:35:21,116
of sorts is literally
domain name to IP address.

818
00:35:21,116 --> 00:35:22,486
It's as simple as that.

819
00:35:22,486 --> 00:35:24,926
So if I have something.com and
its IP address should be x--

820
00:35:24,926 --> 00:35:28,026
w.x.y.z, that's what's
called an A record.

821
00:35:28,476 --> 00:35:32,316
And I can also have
mail.something.com

822
00:35:32,316 --> 00:35:33,906
or calendar.something.com

823
00:35:34,056 --> 00:35:35,806
and I can associate each
with an IP address.

824
00:35:35,886 --> 00:35:36,736
And how do I do this?

825
00:35:36,736 --> 00:35:38,726
It totally depends
on your registrar

826
00:35:38,726 --> 00:35:41,006
or on your DNS provider,
whether it's DreamHost

827
00:35:41,006 --> 00:35:41,826
or GoDaddy or the like.

828
00:35:42,046 --> 00:35:43,786
But these days, it's
usually a web interface.

829
00:35:43,786 --> 00:35:45,666
Back in the day, it
was a command line,

830
00:35:45,666 --> 00:35:47,196
you edit a text file
on a server,

831
00:35:47,196 --> 00:35:49,496
but these days it's been made
to be more user friendly.

832
00:35:49,586 --> 00:35:51,256
But it's essentially
a spreadsheet.
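To make the spreadsheet picture concrete, here is a minimal sketch in Python, assuming a made-up table of rows. The domain names, the 192.0.2.x addresses and the `lookup` helper are all hypothetical, and a real DNS server stores and serves its records differently.

```python
# Each row is (domain name, record type, value), like one row of the spreadsheet.
# The names and the 192.0.2.x addresses are placeholders for illustration only.
RECORDS = [
    ("something.com", "A", "192.0.2.10"),
    ("www.something.com", "A", "192.0.2.10"),
    ("mail.something.com", "A", "192.0.2.25"),
]


def lookup(name, record_type):
    """Return every value stored for this name and record type."""
    wanted = name.lower()
    return [value for (row_name, row_type, value) in RECORDS
            if row_name == wanted and row_type == record_type]
```

Note that both something.com and www.something.com have a row of their own here, which is the DNS half of making a site answer at either name.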

833
00:35:51,816 --> 00:35:54,916
Now, there's two
slightly fancier features.

834
00:35:55,136 --> 00:35:58,446
A CNAME or canonical
name is an alias.

835
00:35:58,936 --> 00:36:01,866
So it turns out with a
lot of these web services

836
00:36:02,106 --> 00:36:05,516
like Google Apps, where Google
is providing the service,

837
00:36:06,176 --> 00:36:07,566
you don't necessarily
want to have

838
00:36:07,626 --> 00:36:10,636
to know what Google's
IP address is, right?

839
00:36:10,636 --> 00:36:12,686
Because one, you probably don't
know anyone who works there,

840
00:36:12,686 --> 00:36:13,916
and so you can't
really ask them.

841
00:36:13,916 --> 00:36:16,006
Now, frankly, you can run a
command and figure it out,

842
00:36:16,326 --> 00:36:19,906
but if you hardcode into your
DNS server the IP address

843
00:36:19,906 --> 00:36:22,656
of google.com, the implication
is that if they ever need

844
00:36:22,656 --> 00:36:25,336
to change their IP address,
which happens, not everyday,

845
00:36:25,336 --> 00:36:26,406
but you know, every few months,

846
00:36:26,406 --> 00:36:28,636
a few years for whatever
technical reasons,

847
00:36:29,056 --> 00:36:30,336
now your website goes down.

848
00:36:30,796 --> 00:36:34,186
It would kind of be better,
at least conceptually,

849
00:36:34,186 --> 00:36:38,776
if calendar.something.com didn't
resolve to Google's IP address,

850
00:36:39,066 --> 00:36:39,876
but rather, what

851
00:36:39,876 --> 00:36:43,606
if calendar.something.com could
instead resolve more generically

852
00:36:43,606 --> 00:36:46,226
to calendar.google.com.

853
00:36:46,656 --> 00:36:49,436
So don't have your domain
mapped to an IP address,

854
00:36:49,436 --> 00:36:53,646
have your domain name mapped
to someone else's domain name,

855
00:36:53,926 --> 00:36:58,066
and then let their DNS server
tell the world what the current

856
00:36:58,066 --> 00:37:00,606
IP address is of
calendar.google.com.

857
00:37:00,846 --> 00:37:03,016
So in other words, if you want
this layer of abstraction,

858
00:37:03,136 --> 00:37:06,246
where you don't care what the
IP address is, you just care

859
00:37:06,356 --> 00:37:08,056
that your domain
name be a synonym

860
00:37:08,056 --> 00:37:11,146
for someone else's domain name,
then you use a CNAME record.

861
00:37:11,256 --> 00:37:13,216
And what the two columns
look like are domain name,

862
00:37:13,216 --> 00:37:16,196
domain name instead of
domain name, IP address.
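Here is a rough sketch in Python of how an alias gets followed, assuming two tiny made-up tables. The `resolve` function, the `max_hops` limit and the 192.0.2.50 address are assumptions for illustration; that address is a placeholder, not Google's actual one.

```python
# CNAME rows map a domain name to another domain name.
CNAME_RECORDS = {"calendar.something.com": "calendar.google.com"}

# A rows map a domain name to an IP address (placeholder value here).
A_RECORDS = {"calendar.google.com": "192.0.2.50"}


def resolve(name, max_hops=8):
    """Follow aliases until an A row is found; give up after max_hops steps."""
    for _ in range(max_hops):
        if name in CNAME_RECORDS:
            name = CNAME_RECORDS[name]   # swap the alias for its target name
        elif name in A_RECORDS:
            return A_RECORDS[name]       # reached a name with an address
        else:
            return None                  # no row for this name at all
    return None                          # too many aliases, probably a loop
```

If the owner of the target name changes the address in their A row, the alias picks up the new address without anyone touching the CNAME row.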

863
00:37:16,616 --> 00:37:19,386
So it's a wonderfully useful
feature, especially these days,

864
00:37:19,386 --> 00:37:21,296
if you look into
hosted solutions,

865
00:37:21,296 --> 00:37:24,566
not just like Google Apps, but
companies that have services

866
00:37:24,566 --> 00:37:27,196
like customer service forums.

867
00:37:27,466 --> 00:37:29,556
If you go to a website,
they'll often have an address

868
00:37:29,556 --> 00:37:33,086
like support.dell.com or
the like or there's a lot

869
00:37:33,086 --> 00:37:33,956
of companies these days

870
00:37:33,956 --> 00:37:38,006
that provide customer service
websites, but it would look kind

871
00:37:38,006 --> 00:37:40,696
of lame if I go to dell.com
and I get redirected

872
00:37:40,696 --> 00:37:42,136
to customerservice.com.

873
00:37:42,406 --> 00:37:47,086
Dell would rather rebrand
someone else's service to look

874
00:37:47,086 --> 00:37:49,366
like Dell even though
someone else implemented it

875
00:37:49,366 --> 00:37:50,196
and is hosting it.

876
00:37:50,516 --> 00:37:52,756
So via a CNAME, someone

877
00:37:52,756 --> 00:37:56,126
like Dell could say
support.dell.com should actually

878
00:37:56,126 --> 00:37:59,146
resolve to customerservice.com,
but the user should never know

879
00:37:59,146 --> 00:38:01,996
that because the URL
stays support.dell.com.

880
00:38:01,996 --> 00:38:03,546
So that's just one of
the things you can do

881
00:38:03,546 --> 00:38:05,176
with these things called CNAMEs.

882
00:38:05,646 --> 00:38:09,156
And lastly, an MX record
is a mail exchange record.

883
00:38:09,406 --> 00:38:12,276
And a mail exchange record
simply states what is the IP

884
00:38:12,276 --> 00:38:15,416
address of the server or servers

885
00:38:15,586 --> 00:38:18,436
that should handle inbound
mail for this domain name.

886
00:38:18,436 --> 00:38:22,166
And this is great, because
when you use Eudora or Gmail

887
00:38:22,166 --> 00:38:26,856
or Outlook and you type in
deardavidmalin@harvard.edu

888
00:38:27,076 --> 00:38:28,946
and hit enter, similarly there,

889
00:38:28,946 --> 00:38:31,306
you have no idea what
Harvard's IP address is,

890
00:38:31,586 --> 00:38:33,936
but your computer does,
but it's not the IP address

891
00:38:33,936 --> 00:38:37,386
of harvard.edu per se that
your email client needs,

892
00:38:37,386 --> 00:38:40,736
it's the IP address of Harvard's
mail server, so thanks to DNS,

893
00:38:41,016 --> 00:38:45,266
your Mac or PC can ask your
ISP's DNS server or dot dot dot,

894
00:38:45,266 --> 00:38:47,006
this whole hierarchy
we discussed earlier,

895
00:38:47,286 --> 00:38:50,206
can say what is the MX
record for harvard.edu.

896
00:38:50,416 --> 00:38:54,916
And harvard.edu's domain's name
server should be able to say

897
00:38:55,126 --> 00:38:56,616
"send all mail to
this IP address."

898
00:38:56,846 --> 00:38:59,496
And what's nice about MX records
is you can have multiple ones

899
00:38:59,496 --> 00:39:03,916
with priorities, so
websites or rather domains

900
00:39:03,916 --> 00:39:06,506
that have very large
numbers of users

901
00:39:06,506 --> 00:39:08,616
where you really don't want
their mail servers going down,

902
00:39:08,896 --> 00:39:11,446
you can have 2 or 3 or 10
different mail servers,

903
00:39:11,806 --> 00:39:14,436
and the DNS system will say,
try this one, then this one,

904
00:39:14,436 --> 00:39:15,536
then this one, then this one,

905
00:39:15,776 --> 00:39:17,646
just in case any of
those go offline.
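A small sketch in Python of that try-this-one-then-that-one idea, assuming made-up host names and a hypothetical `is_online` check. In practice it is the sending mail server that walks the list, and in DNS the lowest preference number is tried first.

```python
# Each row is (preference, mail server name); lower preference is tried first.
# The host names are placeholders, not real mail servers.
MX_RECORDS = [
    (20, "mx2.example.edu"),
    (10, "mx1.example.edu"),
    (30, "mx3.example.edu"),
]


def pick_mail_server(mx_records, is_online):
    """Return the first reachable server in preference order, or None."""
    for _preference, host in sorted(mx_records):
        if is_online(host):
            return host
    return None
```

For example, `pick_mail_server(MX_RECORDS, lambda host: host != "mx1.example.edu")` falls through to the second choice when the first one is down.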

906
00:39:17,916 --> 00:39:18,986
And it's all thanks to DNS.

907
00:39:19,626 --> 00:39:21,926
And while we sort of take
all of these for granted,

908
00:39:22,136 --> 00:39:24,136
once you start developing
your own websites,

909
00:39:24,136 --> 00:39:27,256
maybe creating your own
companies or contributing back

910
00:39:27,256 --> 00:39:29,886
to your own school, having
these abilities is wonderfully

911
00:39:29,886 --> 00:39:33,706
powerful, and it really
boils down to these basics.

912
00:39:34,306 --> 00:39:36,156
Any questions?

913
00:39:37,096 --> 00:39:38,146
No? All right.

914
00:39:39,976 --> 00:39:42,286
So that was kind of long.

915
00:39:42,286 --> 00:39:44,336
Why don't we take a three
or so minute break here?

916
00:39:44,336 --> 00:39:46,366
There's restrooms in the
hallway, there's soda machines,

917
00:39:46,366 --> 00:39:49,126
I think, around the whole
corner, and we'll rejoin

918
00:39:49,126 --> 00:39:50,126
in about three minutes.

919
00:39:50,266 --> 00:39:50,766
All right.

920
00:39:52,216 --> 00:39:53,546
We are back.

921
00:39:53,876 --> 00:39:56,576
So why don't we take a
look at the course itself

922
00:39:56,576 --> 00:39:59,976
and what you are in for and what
the course's expectations are.

923
00:40:00,186 --> 00:40:01,776
So, in terms of prerequisites,

924
00:40:01,776 --> 00:40:03,306
the official prerequisites
are these.

925
00:40:03,306 --> 00:40:05,326
So multiple years of
programming experience as well

926
00:40:05,326 --> 00:40:07,806
as comfort with HTML and CSS.

927
00:40:07,806 --> 00:40:09,396
So, what does this
mean in real terms?

928
00:40:09,396 --> 00:40:11,226
So, summer school is very short.

929
00:40:11,226 --> 00:40:12,496
It's just about six weeks

930
00:40:12,526 --> 00:40:16,176
and the course has three
nontrivially sized projects

931
00:40:16,426 --> 00:40:18,376
and the goal really is to
make sure that at the end

932
00:40:18,376 --> 00:40:21,526
of the short semester, you feel
quite comfortable going off

933
00:40:21,526 --> 00:40:23,716
and doing much more
on your own in the way

934
00:40:23,716 --> 00:40:27,446
of web site development not
just HTML and CSS in the form

935
00:40:27,446 --> 00:40:30,336
of static websites but truly the
dynamic websites that are driven

936
00:40:30,336 --> 00:40:34,746
by a language like PHP and
JavaScript, backed by a database

937
00:40:34,746 --> 00:40:37,816
like MySQL so it's a
fairly intense course.

938
00:40:38,686 --> 00:40:42,616
If you've only taken something
like Computer Science S-1,

939
00:40:42,616 --> 00:40:46,096
the introductory computer
science class or just one

940
00:40:46,096 --> 00:40:48,476
or two courses, I will
say from past experience,

941
00:40:48,746 --> 00:40:52,916
you'll probably find the course
challenging to say the least

942
00:40:53,296 --> 00:40:56,806
and typically we estimate
about 30 hours per project.

943
00:40:56,806 --> 00:40:57,886
So, there's three projects

944
00:40:57,886 --> 00:41:00,176
so you have roughly nine
days for each of them.

945
00:41:00,176 --> 00:41:03,056
It's about 30 hours each
but that would be on average,

946
00:41:03,056 --> 00:41:06,796
so for students for whom programming
is a little less familiar

947
00:41:06,796 --> 00:41:08,706
or it's been a bunch of
years since you programmed

948
00:41:08,706 --> 00:41:10,776
or you've only taken one
or two introductory courses

949
00:41:10,776 --> 00:41:13,136
but don't really think of
yourself as a programmer,

950
00:41:13,476 --> 00:41:15,186
the course is definitely
more challenging.

951
00:41:15,186 --> 00:41:17,506
So, do be aware diving in.

952
00:41:17,806 --> 00:41:20,786
I will say if you're on
that fence and not sure

953
00:41:20,786 --> 00:41:23,016
if your comfort level
and background is there,

954
00:41:23,256 --> 00:41:27,316
you can go to CS75.tv which
is the open courseware site

955
00:41:27,316 --> 00:41:30,376
for the course, where we have
several previous semesters worth

956
00:41:30,376 --> 00:41:33,546
of lecture videos,
handouts, projects,

957
00:41:33,606 --> 00:41:37,186
some of which will be similar
to this summer's, so by looking

958
00:41:37,186 --> 00:41:39,066
to the past, you
can perhaps infer

959
00:41:39,066 --> 00:41:41,626
as to what the summer will
be like and get a sense then

960
00:41:41,906 --> 00:41:44,996
if the PDFs of past years'
projects completely overwhelm

961
00:41:44,996 --> 00:41:47,676
you or completely excite
you so I would try to use

962
00:41:47,676 --> 00:41:50,946
that as an additional input
tonight before deciding whether

963
00:41:50,946 --> 00:41:52,176
this is the course for you.

964
00:41:52,666 --> 00:41:56,076
In terms of expectations
there are these three projects

965
00:41:56,076 --> 00:41:58,926
and attending or watching, if
distant or unable to make it

966
00:41:59,416 --> 00:42:00,806
in person the lectures,

967
00:42:00,806 --> 00:42:03,086
the lectures will be
structured as follows.

968
00:42:03,086 --> 00:42:05,746
So, tonight our focus
is on HTTP and sort

969
00:42:05,746 --> 00:42:08,086
of the mechanics the underlying
fundamentals of the internet

970
00:42:08,086 --> 00:42:09,856
that for years you've
probably taken for granted

971
00:42:09,856 --> 00:42:12,386
but once you really start
building your own websites

972
00:42:12,386 --> 00:42:15,256
and having to negotiate things
like configurations of servers

973
00:42:15,256 --> 00:42:18,496
and code and databases, tonight
we'll start looking at some

974
00:42:18,496 --> 00:42:19,836
of those more technical details.

975
00:42:20,116 --> 00:42:24,466
On Wednesday and next week,
we'll look at PHP itself.

976
00:42:24,466 --> 00:42:28,516
So, one of PHP's most
compelling features these days

977
00:42:28,516 --> 00:42:30,906
is one, it's syntactically
very familiar--

978
00:42:30,906 --> 00:42:33,866
very similar to languages
with which many of the folks

979
00:42:33,866 --> 00:42:36,806
in this room and at home
are familiar syntactically,

980
00:42:36,806 --> 00:42:41,486
it's similar to C and C++ and
other procedural languages.

981
00:42:41,486 --> 00:42:43,776
It's very much in
vogue, pretty popular,

982
00:42:43,776 --> 00:42:45,626
it's pretty omnipresent
these days in terms

983
00:42:45,626 --> 00:42:47,536
of the web hosting companies
that are out there

984
00:42:47,626 --> 00:42:50,566
and it's super easy to get set
up even Mac OS comes with PHP

985
00:42:50,566 --> 00:42:54,376
and Apache, the web server,
pre-installed, and there are packages

986
00:42:54,376 --> 00:42:56,066
for Windows and Linux
and other computers

987
00:42:56,066 --> 00:42:58,906
that make it super simple
to get set up in terms

988
00:42:58,906 --> 00:43:01,936
of related languages, Python

989
00:43:01,936 --> 00:43:04,546
and Ruby are probably the two
closest contenders in terms

990
00:43:04,546 --> 00:43:06,376
of popularity with PHP.

991
00:43:06,376 --> 00:43:08,606
And none of these is necessarily
better than the others.

992
00:43:08,606 --> 00:43:11,276
It could very quickly devolve
into religious debates but one

993
00:43:11,276 --> 00:43:13,726
of the nice things about PHP
is again the omnipresence

994
00:43:13,726 --> 00:43:14,966
of support for it out there

995
00:43:15,136 --> 00:43:17,966
and also I think
pedagogically the documentation

996
00:43:17,966 --> 00:43:19,486
for PHP is outstanding

997
00:43:19,486 --> 00:43:24,066
and as you'll see the PHP.net
online reference manual

998
00:43:24,066 --> 00:43:26,976
for functions and whatnot
is rich with examples,

999
00:43:26,976 --> 00:43:29,216
intelligent discussions
and so we've just found

1000
00:43:29,476 --> 00:43:31,436
that it's a very
nice way of diving

1001
00:43:31,436 --> 00:43:32,986
in deeper to web programming.

1002
00:43:32,986 --> 00:43:35,196
And from a course like this, you
should be able to continue on.

1003
00:43:35,446 --> 00:43:38,186
If you haven't come from that
direction already to the likes

1004
00:43:38,216 --> 00:43:42,196
of Python and Ruby,
JSP for the Java world,

1005
00:43:42,196 --> 00:43:43,726
ASP for the Windows world.

1006
00:43:43,946 --> 00:43:45,906
There's a lot of
commonalities among them.

1007
00:43:46,056 --> 00:43:49,166
We'll transition in lecture
three to looking at XML.

1008
00:43:49,166 --> 00:43:49,936
So, when it comes time

1009
00:43:49,936 --> 00:43:52,776
to actually store data, whether
statically or dynamically,

1010
00:43:52,776 --> 00:43:54,356
you don't need a
full-fledged database.

1011
00:43:54,356 --> 00:43:56,066
You don't need MySQL,
you don't need Oracle,

1012
00:43:56,066 --> 00:43:57,456
you don't need anything
along those lines,

1013
00:43:57,456 --> 00:43:59,506
you could just use text
files but it would be nice

1014
00:43:59,506 --> 00:44:02,066
if it's easy to read and
write those text files

1015
00:44:02,066 --> 00:44:06,386
so XML is a very popular
language of sorts, a metalanguage

1016
00:44:06,386 --> 00:44:08,726
with which to write
out textual files

1017
00:44:08,726 --> 00:44:10,366
and it's representative
more generally

1018
00:44:10,366 --> 00:44:12,536
of a topic we'll come back
to in our JavaScript lectures

1019
00:44:12,816 --> 00:44:14,716
on the document object model

1020
00:44:14,716 --> 00:44:16,336
so there'll be some
commonalities there.

1021
00:44:16,496 --> 00:44:20,426
SQL, structured query
language is what's used

1022
00:44:20,426 --> 00:44:23,936
by many relational databases
these days, among them MySQL,

1023
00:44:24,176 --> 00:44:25,966
Oracle, Postgres and others.

1024
00:44:26,146 --> 00:44:29,716
Also in vogue these
days are NoSQL servers,

1025
00:44:29,716 --> 00:44:32,506
document storage engines
which we'll look at later

1026
00:44:32,506 --> 00:44:33,736
in the semester as well.

1027
00:44:33,876 --> 00:44:36,216
But we'll primarily use for
the course's projects MySQL,

1028
00:44:36,216 --> 00:44:39,136
we'll look in Lectures
6 and 7 at JavaScript

1029
00:44:39,246 --> 00:44:41,396
and its more general
technique of Ajax,

1030
00:44:41,456 --> 00:44:45,096
the ability to use JavaScript
to query a server even

1031
00:44:45,096 --> 00:44:47,536
after a page is loaded
to get back more data

1032
00:44:47,536 --> 00:44:51,116
for instance, Google Maps
does this to get more squares

1033
00:44:51,116 --> 00:44:53,426
of mapping information
when you click and drag.

1034
00:44:53,516 --> 00:44:56,156
Facebook does this to
push live updates

1035
00:44:56,156 --> 00:44:57,576
from your News Feed
and the like.

1036
00:44:57,926 --> 00:44:59,686
We'll look through toward
the end of the semester then

1037
00:45:00,126 --> 00:45:02,606
at some higher level
concepts like security

1038
00:45:02,606 --> 00:45:04,696
which we'll interlace
throughout the semester

1039
00:45:04,696 --> 00:45:07,516
but we'll really focus on
it in Lecture 8 looking

1040
00:45:07,516 --> 00:45:09,996
at common attacks on web
servers, on web sites,

1041
00:45:09,996 --> 00:45:13,536
on databases, so as to not
necessarily acquaint you

1042
00:45:13,536 --> 00:45:15,716
with everything that
can go wrong but to

1043
00:45:15,716 --> 00:45:17,266
at least plant the
seeds in your mind

1044
00:45:17,266 --> 00:45:19,026
of things you should
be thinking about.

1045
00:45:19,026 --> 00:45:20,776
Indeed there is so
much code out there

1046
00:45:21,046 --> 00:45:23,886
that is just vulnerable because
people don't think to do things

1047
00:45:23,886 --> 00:45:26,246
like sanitize user input
that is they don't check it

1048
00:45:26,246 --> 00:45:27,436
for dangerous characters.

1049
00:45:27,716 --> 00:45:30,076
So, we'll talk about things
like SQL injection attacks.

1050
00:45:30,076 --> 00:45:33,816
We'll talk about cross-site
scripting attacks and any number

1051
00:45:33,816 --> 00:45:37,246
of other ways that are
so darn easy to avoid

1052
00:45:37,486 --> 00:45:39,476
yet many people just
don't realize it

1053
00:45:39,476 --> 00:45:42,746
or don't know how even though
typically simple little function

1054
00:45:42,746 --> 00:45:43,496
calls can fix them.
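As one hedged illustration of how small the fix can be, here is a sketch in Python using the standard sqlite3 module rather than the PHP and MySQL the course itself uses. The `users` table and the `make_demo_db` and `find_user` helpers are hypothetical; the point is only that the `?` placeholder keeps user input out of the SQL text.

```python
import sqlite3


def make_demo_db():
    """Build a throwaway in-memory database with a made-up users table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
    conn.executemany("INSERT INTO users (name) VALUES (?)",
                     [("alice",), ("bob",)])
    return conn


def find_user(conn, username):
    """Look up a user by name, passing the input as a bound parameter."""
    # The ? placeholder means username is treated purely as data,
    # so quotes or SQL keywords inside it cannot change the query.
    cursor = conn.execute("SELECT id, name FROM users WHERE name = ?",
                          (username,))
    return cursor.fetchall()
```

With this shape, an input such as `' OR '1'='1` is just an odd user name that matches no rows, whereas pasting it into the query string directly would match every row.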

1055
00:45:43,496 --> 00:45:45,856
And then the last lecture,
we'll look at scalability.

1056
00:45:45,856 --> 00:45:48,126
So, it would be a great
problem to have if you've got

1057
00:45:48,156 --> 00:45:52,296
so much traffic that all of the
lessons you learn from lecture 0

1058
00:45:52,296 --> 00:45:56,686
through 8 starts-- your website
starts crumbling under the load

1059
00:45:56,686 --> 00:45:58,526
and so we'll conclude
the semester by looking

1060
00:45:58,526 --> 00:46:01,716
at, OK, now you have to build
not for a few dozen people,

1061
00:46:01,716 --> 00:46:04,206
a few 100 people at your
school but several 1000

1062
00:46:04,276 --> 00:46:06,776
or maybe even several 1000
people per second how do you

1063
00:46:06,776 --> 00:46:10,266
actually scale from one little
web server to a bigger one

1064
00:46:10,466 --> 00:46:11,866
but then once you
have the biggest

1065
00:46:11,866 --> 00:46:14,176
and most expensive available
web server what do you do

1066
00:46:14,176 --> 00:46:16,566
but you start to scale
as they say horizontally.

1067
00:46:16,666 --> 00:46:19,076
So, you get multiple
servers, maybe even cheaper,

1068
00:46:19,076 --> 00:46:21,606
slower web servers but
you somehow figure out how

1069
00:46:21,606 --> 00:46:24,426
to balance, load balance so
to speak, traffic across them.
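The simplest version of that idea can be sketched in Python as a round-robin rotation. The `RoundRobinBalancer` class and the server names are hypothetical, and real load balancers weigh many more factors than taking turns.

```python
import itertools


class RoundRobinBalancer:
    """Hand out servers in a fixed rotation, one per incoming request."""

    def __init__(self, servers):
        servers = list(servers)
        if not servers:
            raise ValueError("need at least one server")
        # cycle() repeats the list forever: web1, web2, web3, web1, ...
        self._rotation = itertools.cycle(servers)

    def next_server(self):
        """Return the server that should handle the next request."""
        return next(self._rotation)
```

Usage would look like `balancer = RoundRobinBalancer(["web1", "web2", "web3"])` followed by one `balancer.next_server()` call per request.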

1070
00:46:24,656 --> 00:46:27,786
How do you do that with databases?
How do you do that geographically?

1071
00:46:28,036 --> 00:46:29,846
How do you do that
with cloud computing.

1072
00:46:29,846 --> 00:46:32,006
A buzz word that's all
the rage these days

1073
00:46:32,006 --> 00:46:34,126
but has some very interesting
technologies underlying it.

1074
00:46:34,386 --> 00:46:35,486
We'll wrap up the
semester looking

1075
00:46:35,486 --> 00:46:37,626
at those bigger picture issues.

1076
00:46:38,196 --> 00:46:42,946
In addition to said lectures,
we will have most weeks sections

1077
00:46:42,946 --> 00:46:46,356
and office hours so the course
has four teaching fellows,

1078
00:46:46,356 --> 00:46:47,326
folks who have either taught

1079
00:46:47,326 --> 00:46:50,626
or taken the course before
who'll be with us in the form

1080
00:46:50,626 --> 00:46:51,926
of sections and office hours,

1081
00:46:51,926 --> 00:46:54,306
sections will be a slightly
more intimate opportunity

1082
00:46:54,306 --> 00:46:57,436
so on Wednesdays and Mondays
typically right after lecture

1083
00:46:57,436 --> 00:46:59,396
if you'd like to stick around
to dive a little deeper

1084
00:46:59,396 --> 00:47:00,486
into that week's project.

1085
00:47:00,816 --> 00:47:02,926
So in addition to the PDF
specification you'll get

1086
00:47:02,926 --> 00:47:03,496
of a project.

1087
00:47:03,756 --> 00:47:06,596
One of the TFs will walk you
through the week's project

1088
00:47:06,596 --> 00:47:09,666
to give pointers,
offer some design tips,

1089
00:47:09,786 --> 00:47:12,016
some helpful direction,
answer any confusion.

1090
00:47:12,016 --> 00:47:13,986
If I do something quite
poorly in lecture,

1091
00:47:13,986 --> 00:47:16,576
we can revisit those kinds
of topics in sections

1092
00:47:16,576 --> 00:47:18,226
so that you get another
perspective altogether

1093
00:47:18,506 --> 00:47:21,646
and then office hours which is
meant to simply follow a section

1094
00:47:21,646 --> 00:47:23,816
so once section officially
wraps after an hour.

1095
00:47:23,946 --> 00:47:27,006
Office hours will be an
opportunity for one on one Q&A

1096
00:47:27,006 --> 00:47:30,126
with one or two of the TFs and
this will be an opportunity

1097
00:47:30,126 --> 00:47:32,656
in particular for
questions with the projects

1098
00:47:32,656 --> 00:47:34,426
if you're having trouble
understanding something

1099
00:47:34,426 --> 00:47:37,026
or trouble chasing down some
bug in addition to reaching

1100
00:47:37,026 --> 00:47:39,346
out to us online we'll have
these in-person opportunities

1101
00:47:39,346 --> 00:47:41,516
for those of you who are local, for
those of you who are distant,

1102
00:47:41,806 --> 00:47:44,336
more on the online
opportunities in just a moment.

1103
00:47:44,896 --> 00:47:48,906
In addition to the course's
classes, there are projects,

1104
00:47:48,906 --> 00:47:52,476
three of them and they will
flow roughly in the order

1105
00:47:52,476 --> 00:47:55,286
of the topics on the syllabus
where we'll start in terms

1106
00:47:55,286 --> 00:47:58,446
of PHP which would be new to
some or most people in the room,

1107
00:47:58,706 --> 00:48:01,256
then we'll introduce
mid-semester databases

1108
00:48:01,256 --> 00:48:04,766
and MySQL then we'll introduce
JavaScript and Ajax and so

1109
00:48:04,766 --> 00:48:07,496
that'll essentially be
the tripartite approach

1110
00:48:07,796 --> 00:48:10,616
of the course's projects
in terms of the topics.

1111
00:48:10,616 --> 00:48:14,546
In terms of grades, the projects
are graded fairly holistically

1112
00:48:14,546 --> 00:48:16,376
because you'll be
encouraged to make a lot

1113
00:48:16,376 --> 00:48:17,966
of design decisions on your own.

1114
00:48:18,216 --> 00:48:19,306
You won't necessarily have

1115
00:48:19,306 --> 00:48:22,906
to implement precisely what we
tell you to rather you'll have

1116
00:48:22,906 --> 00:48:25,166
to meet certain feature
and technical requirements

1117
00:48:25,496 --> 00:48:28,236
so we'll evaluate the three
projects on these axes.

1118
00:48:28,236 --> 00:48:31,536
So, scope will be an axis:
it will be a numeric score

1119
00:48:31,536 --> 00:48:34,926
that captures how much of the
project you actually attempted.

1120
00:48:35,256 --> 00:48:38,516
Correctness will capture
how much of your code works

1121
00:48:38,516 --> 00:48:40,716
in accordance with the
spec. If it's very buggy,

1122
00:48:40,716 --> 00:48:42,306
that would not be
considered very correct.

1123
00:48:42,676 --> 00:48:45,636
Design is more subjective,
design is OK it might work,

1124
00:48:45,636 --> 00:48:47,526
might work perfectly
but does it look

1125
00:48:47,526 --> 00:48:48,756
like a mess underneath the hood.

1126
00:48:48,756 --> 00:48:50,426
Do you have like 10
nested for loops.

1127
00:48:50,456 --> 00:48:52,316
That is not good
design for instance

1128
00:48:52,506 --> 00:48:53,956
and so design would
be an opportunity

1129
00:48:53,956 --> 00:48:55,886
for particularly
qualitative feedback

1130
00:48:55,996 --> 00:48:57,426
from the teaching
fellows on your code.

1131
00:48:57,726 --> 00:49:00,696
And style is more
of the aesthetics

1132
00:49:00,696 --> 00:49:02,556
are your variables
reasonably named

1133
00:49:02,556 --> 00:49:04,116
or is your code well commented?

1134
00:49:04,166 --> 00:49:05,446
Is it nicely indented?

1135
00:49:05,656 --> 00:49:08,886
The sort of easy things that
are good habits to get into even

1136
00:49:08,886 --> 00:49:11,706
if you're not in them; that's
what we define as style

1137
00:49:11,896 --> 00:49:13,916
and then just for
reference, things are weighted

1138
00:49:13,916 --> 00:49:16,466
by roughly the amount
of time that's required

1139
00:49:16,466 --> 00:49:17,426
to get things right.

1140
00:49:17,796 --> 00:49:20,326
So, for instance this
is the formula we'll use

1141
00:49:20,326 --> 00:49:22,326
to compute a total score
for each of the projects,

1142
00:49:22,566 --> 00:49:24,956
where correctness for instance
is weighted more heavily

1143
00:49:25,246 --> 00:49:28,886
than style and that should
capture the fact that, you know,

1144
00:49:28,886 --> 00:49:31,246
indenting your code probably
shouldn't take you all that long

1145
00:49:31,246 --> 00:49:33,756
but chasing down bugs can
certainly take quite a bit

1146
00:49:33,756 --> 00:49:34,076
of time.

1147
00:49:34,246 --> 00:49:36,176
So, the formula's
meant to capture that.

1148
00:49:36,976 --> 00:49:39,286
The course's website
which I'll follow

1149
00:49:39,286 --> 00:49:41,676
up in just a moment has
everything that you will need

1150
00:49:41,676 --> 00:49:43,556
for the course including
videos of lectures

1151
00:49:43,556 --> 00:49:45,476
if you can't make some
evening or if it's tricky

1152
00:49:45,476 --> 00:49:47,246
because of full time
work, it's totally fine

1153
00:49:47,246 --> 00:49:48,956
to watch the course's
lectures online,

1154
00:49:48,956 --> 00:49:51,256
all of the handouts similarly
will be available there.

1155
00:49:51,256 --> 00:49:54,456
What we'll be rolling out over
e-mail this week is access

1156
00:49:54,456 --> 00:49:57,486
to a tool that I-- actually
we've used in another class

1157
00:49:57,486 --> 00:49:59,696
of mine called CS50 but
it's a discussion tool

1158
00:49:59,696 --> 00:50:02,336
that will allow you to interact
with classmates, with myself,

1159
00:50:02,336 --> 00:50:02,946
with the course's TFs online.

1160
00:50:03,066 --> 00:50:06,916
So, online discussion forums
of sorts but using some

1161
00:50:06,916 --> 00:50:08,576
of the same technologies
that we'll talk

1162
00:50:08,576 --> 00:50:11,546
about in the class
including Ajax and similar.

1163
00:50:11,746 --> 00:50:14,876
So, you will soon be receiving
e-mails from us with invitations

1164
00:50:14,876 --> 00:50:17,726
of sorts to create
accounts within the website

1165
00:50:17,986 --> 00:50:20,056
so that we can-- you can
start directing questions

1166
00:50:20,056 --> 00:50:23,816
to classmates or privately to
the staff via CS50 Discuss.

1167
00:50:24,226 --> 00:50:26,456
So, any questions
on the structure,

1168
00:50:26,456 --> 00:50:28,986
expectations, whether
you're in the right place?

1169
00:50:29,426 --> 00:50:31,256
As to the course itself.

1170
00:50:32,086 --> 00:50:34,366
No. All right.

1171
00:50:34,556 --> 00:50:35,796
In terms of-- yeah.

1172
00:50:36,106 --> 00:50:38,296
>> Is there added
credit for attendance?

1173
00:50:38,296 --> 00:50:42,546
>> Nope. Attendance is
expected and encouraged

1174
00:50:42,606 --> 00:50:44,886
but it's not factored in.

1175
00:50:45,016 --> 00:50:47,736
So, see you at the end of the
semester, probably, right?

1176
00:50:47,986 --> 00:50:51,356
But in terms of lecture,
typically, we're slated for 3:15

1177
00:50:51,356 --> 00:50:54,066
to 6:15, I think. We'll
rarely go all three hours.

1178
00:50:54,066 --> 00:50:57,196
Typically, the same course
during the year is two hours

1179
00:50:57,196 --> 00:50:59,386
of class so we'll typically
have a little bit of wiggle room

1180
00:50:59,386 --> 00:51:01,666
and let me not commit to
just two hours per night

1181
00:51:01,946 --> 00:51:04,626
but we will typically not go I
think as many as three hours.

1182
00:51:04,626 --> 00:51:07,746
So, that's frankly a lot to
take in twice a week, no less.

1183
00:51:08,576 --> 00:51:12,756
So, we shall see where
we end up each night.

1184
00:51:13,376 --> 00:51:15,666
Any other questions?

1185
00:51:15,666 --> 00:51:19,506
With regard to sections, the
implication of that detail is

1186
00:51:19,506 --> 00:51:21,516
that sections will
not start necessarily

1187
00:51:21,516 --> 00:51:22,696
at the pre-ordained time.

1188
00:51:22,696 --> 00:51:24,686
What we'll try to do is the
TFs will come a bit early

1189
00:51:24,906 --> 00:51:26,436
so if we're done with
the lecture proper,

1190
00:51:26,476 --> 00:51:29,716
we'll take a short break then
dive in right immediately

1191
00:51:29,716 --> 00:51:31,816
to section and office hours

1192
00:51:31,816 --> 00:51:33,876
so that you don't sit here
awkwardly just waiting

1193
00:51:33,876 --> 00:51:35,286
for an arbitrary
time to come around.

1194
00:51:35,516 --> 00:51:35,616
Yeah.

1195
00:51:36,216 --> 00:51:37,976
>> How will that then work
with the distance students?

1196
00:51:38,146 --> 00:51:41,416
>> For the distance students,
sections will be filmed as well

1197
00:51:41,486 --> 00:51:44,796
and we will be making ample
use of online interactions

1198
00:51:44,796 --> 00:51:46,496
for students who are
primarily distance

1199
00:51:46,946 --> 00:51:49,936
and we've also experimented in
the past with things like Skype

1200
00:51:49,936 --> 00:51:52,246
and video conferencing
or online chats.

1201
00:51:52,246 --> 00:51:53,306
We're quite flexible

1202
00:51:53,356 --> 00:51:55,826
for whatever works
pedagogically for folks.

1203
00:51:55,886 --> 00:51:57,026
>> [Inaudible] distance students

1204
00:51:57,076 --> 00:52:00,376
to attend the section
via chat or something?

1205
00:52:00,376 --> 00:52:00,906
>> Good question.

1206
00:52:01,126 --> 00:52:03,156
Typically, not for distance
students with sections.

1207
00:52:03,156 --> 00:52:05,086
We do film them but
there is some latency

1208
00:52:05,086 --> 00:52:07,566
in when we post them. We
may experiment with trying

1209
00:52:07,566 --> 00:52:09,846
to stream some things online
but this room is not equipped

1210
00:52:09,846 --> 00:52:12,946
for that so, I shouldn't make
promises to that just yet.

1211
00:52:13,596 --> 00:52:15,486
But either way things will
be available asynchronously

1212
00:52:15,486 --> 00:52:15,946
after the fact.

1213
00:52:16,326 --> 00:52:16,406
Yeah.

1214
00:52:17,136 --> 00:52:17,966
>> What are the office hours?

1215
00:52:18,556 --> 00:52:20,416
>> The office hours
will typically be right

1216
00:52:20,596 --> 00:52:22,646
after sections on
Mondays and Wednesdays

1217
00:52:22,646 --> 00:52:24,736
which are right after lectures.

1218
00:52:25,096 --> 00:52:27,386
The motivation being especially
for folks who commute,

1219
00:52:27,646 --> 00:52:29,946
we figured we'd try to compact
things to Mondays and Wednesdays

1220
00:52:29,946 --> 00:52:31,616
so you don't have to
come to campus yet again.

1221
00:52:32,046 --> 00:52:32,976
And we're flexible too.

1222
00:52:32,976 --> 00:52:35,346
If for instance, you're really
struggling in the class,

1223
00:52:35,346 --> 00:52:37,686
you've got lots of questions
or your schedule,

1224
00:52:37,686 --> 00:52:39,586
you have a night time
class right afterward,

1225
00:52:39,776 --> 00:52:41,896
we're happy to do things
by appointment as well.

1226
00:52:42,946 --> 00:52:45,026
So, in short, we'll meet
you halfway as best we can.

1227
00:52:46,356 --> 00:52:46,746
All right.

1228
00:52:46,996 --> 00:52:48,306
So, web hosting companies.

1229
00:52:48,306 --> 00:52:51,116
We talked earlier about DNS
and sort of getting traffic

1230
00:52:51,196 --> 00:52:53,996
to some destination B
but once they get there,

1231
00:52:53,996 --> 00:52:55,326
what's waiting for the user?

1232
00:52:55,326 --> 00:52:59,206
Where are your HTML and CSS and
soon, PHP files actually stored?

1233
00:52:59,516 --> 00:53:02,036
So, this is a little screen shot
from this one company DreamHost

1234
00:53:02,036 --> 00:53:04,376
and I don't necessarily
recommend them over any others

1235
00:53:04,616 --> 00:53:06,976
but they're popular and
well-known and super cheap.

1236
00:53:06,976 --> 00:53:08,456
And just to give you a
sense of what you get

1237
00:53:09,106 --> 00:53:11,666
and what you don't really
get, here is a screen shot

1238
00:53:11,666 --> 00:53:14,606
of what you get for
apparently 8.95 per month.

1239
00:53:14,936 --> 00:53:17,196
So, you apparently get
unlimited terabytes

1240
00:53:17,586 --> 00:53:18,876
of disk storage space.

1241
00:53:19,586 --> 00:53:22,216
You get unlimited terabytes
of monthly bandwidth.

1242
00:53:22,266 --> 00:53:24,796
You get an unlimited
number of domains hosted,

1243
00:53:24,796 --> 00:53:27,676
you get an unlimited number of
user accounts, e-mail accounts,

1244
00:53:27,676 --> 00:53:30,926
MySQL databases and a
particular distribution

1245
00:53:30,926 --> 00:53:32,186
of Linux called Debian here.

1246
00:53:32,656 --> 00:53:34,256
So, it kind of sounds
too good to be true.

1247
00:53:34,696 --> 00:53:35,746
So, what is the catch here?

1248
00:53:36,036 --> 00:53:38,506
Like that's an amazing
deal for 8.95 a month.

1249
00:53:39,016 --> 00:53:39,886
Unlimited everything.

1250
00:53:41,376 --> 00:53:42,936
So, what are some of the catches

1251
00:53:42,936 --> 00:53:44,666
or what are they doing
here technologically

1252
00:53:44,666 --> 00:53:45,706
to make this possible?

1253
00:53:45,926 --> 00:53:49,576
>> Well, they're not
alone on their own server.

1254
00:53:49,576 --> 00:53:51,586
They're sharing it with a
bunch of other sites.

1255
00:53:51,586 --> 00:53:52,166
>> Exactly.

1256
00:53:52,166 --> 00:53:55,586
So, a lot of these web hosting
companies are shared services

1257
00:53:55,586 --> 00:53:57,386
whereby you might get this

1258
00:53:57,726 --> 00:53:59,626
but they're also promising
the exact same thing

1259
00:53:59,626 --> 00:54:01,906
to 10 other people,
to 100 other people.

1260
00:54:02,166 --> 00:54:04,056
Now, it turns out that in HTTP,

1261
00:54:04,106 --> 00:54:06,026
the protocol we discussed
earlier,

1262
00:54:06,396 --> 00:54:07,926
there is a feature these days

1263
00:54:07,926 --> 00:54:09,366
for what's called
virtual hosting.

1264
00:54:09,666 --> 00:54:11,146
So, back in the day for the web,

1265
00:54:11,146 --> 00:54:14,286
every website needed a unique
IP address, essentially.

1266
00:54:14,356 --> 00:54:16,776
So, that when you typed in
something dot com you went

1267
00:54:16,776 --> 00:54:19,256
to one website and that
website lived on a server,

1268
00:54:19,256 --> 00:54:20,526
that server had an IP address.

1269
00:54:20,586 --> 00:54:21,866
And if you wanted
a second website,

1270
00:54:22,176 --> 00:54:24,206
you better get a second
server or at least give

1271
00:54:24,206 --> 00:54:25,706
that computer a second
IP address.

1272
00:54:26,036 --> 00:54:28,436
However, in more
recent versions of HTTP,

1273
00:54:28,866 --> 00:54:30,546
we'll see through
some experimentation

1274
00:54:30,546 --> 00:54:34,886
with actual browsers, browsers
send another HTTP header.

1275
00:54:34,886 --> 00:54:35,846
They don't just send GETs.

1276
00:54:36,186 --> 00:54:38,636
They also send a
reminder to the web server

1277
00:54:38,636 --> 00:54:40,616
as to what the user
typed in to the URL

1278
00:54:40,946 --> 00:54:44,166
so that you can now have
these days multiple websites.

1279
00:54:44,496 --> 00:54:46,566
Foo.com, bar.com, baz.com,

1280
00:54:46,566 --> 00:54:48,476
all living on the
same physical server,

1281
00:54:48,716 --> 00:54:50,686
at the exact same IP address

1282
00:54:51,176 --> 00:54:55,086
and because the browsers remind
the server what the user typed

1283
00:54:55,086 --> 00:54:58,956
in, foo.com or bar.com
or baz.com, the server,

1284
00:54:59,086 --> 00:54:59,846
even though it's receiving

1285
00:54:59,846 --> 00:55:02,056
for three different
websites can figure

1286
00:55:02,056 --> 00:55:05,486
out from those so-called headers
what was requested and then,

1287
00:55:05,576 --> 00:55:08,566
return the appropriate
domain's homepage.

1288
00:55:08,946 --> 00:55:14,116
So, in this case, that's great
because it makes this possible.

1289
00:55:14,116 --> 00:55:16,176
We only have 4 billion
IP addresses in the world

1290
00:55:16,176 --> 00:55:18,676
and they are legitimately
running out and so,

1291
00:55:18,676 --> 00:55:21,586
this is great that we can
multiplex servers in this way

1292
00:55:21,586 --> 00:55:24,186
and put multiple
people, multiple websites

1293
00:55:24,186 --> 00:55:25,416
on the same IP address.

1294
00:55:25,816 --> 00:55:28,316
But there's a couple of gotchas.
What's the implication of this,

1295
00:55:29,006 --> 00:55:30,906
the fact that multiple customers
are on the same machine?

1296
00:55:31,266 --> 00:55:33,586
>> Well, if one-- well if
the server crashes, all

1297
00:55:33,586 --> 00:55:37,786
of the websites will go down.

1298
00:55:37,976 --> 00:55:39,466
>> Good. So, if the
machine crashes,

1299
00:55:39,466 --> 00:55:41,806
now all of you are affected
rather than just the one.

1300
00:55:42,226 --> 00:55:43,636
>> Contention for resources.

1301
00:55:43,986 --> 00:55:45,656
>> Contention for
resources, right?

1302
00:55:45,696 --> 00:55:49,546
So, you're kind of in bad luck--
a bad place, if for instance,

1303
00:55:49,546 --> 00:55:53,116
one of the other customers on
the web server is Facebook.com

1304
00:55:53,116 --> 00:55:56,256
or something that achieves
unexpected popularity all

1305
00:55:56,256 --> 00:55:59,126
of a sudden or maybe it's a
website that's really ticked

1306
00:55:59,126 --> 00:56:01,116
someone off and is
getting some kind

1307
00:56:01,186 --> 00:56:03,226
of internet-based
attack like a denial

1308
00:56:03,226 --> 00:56:05,916
of service attack because people
are going after that website

1309
00:56:05,916 --> 00:56:07,696
and just because your server--

1310
00:56:07,696 --> 00:56:10,806
your website is on the same
server, now, you are down

1311
00:56:10,806 --> 00:56:12,546
or otherwise offline as well.

1312
00:56:12,766 --> 00:56:13,846
Moreover, one of the--

1313
00:56:13,846 --> 00:56:17,846
one of the ways in which
these companies offer

1314
00:56:17,846 --> 00:56:20,166
such discounted prices is
because it's not just you

1315
00:56:20,166 --> 00:56:20,976
and two other websites.

1316
00:56:20,976 --> 00:56:22,726
It's probably not
10, it could be 100,

1317
00:56:22,726 --> 00:56:26,826
it could be 1000 other customers
on the same server and so,

1318
00:56:26,826 --> 00:56:28,226
there must be some fine print.

1319
00:56:28,226 --> 00:56:30,106
And hopefully, there is
some fine print somewhere

1320
00:56:30,326 --> 00:56:32,816
that does say this is subject
to something or other, right?

1321
00:56:32,816 --> 00:56:35,236
They don't have infinite
terabytes on their web server.

1322
00:56:35,236 --> 00:56:36,596
They don't have infinite
bandwidth.

1323
00:56:36,596 --> 00:56:38,896
There's got to be some
catch here, otherwise,

1324
00:56:38,896 --> 00:56:41,176
the world would not pay
$1000 a month to host

1325
00:56:41,176 --> 00:56:43,276
real large-scale websites.

1326
00:56:43,636 --> 00:56:45,896
So, you, again, sort of
get what you pay for.

1327
00:56:45,896 --> 00:56:47,106
And this is actually expensive.

1328
00:56:47,106 --> 00:56:49,566
Years ago, I signed up for
some fly-by-night operation

1329
00:56:49,566 --> 00:56:54,366
for like 2.95 a year to host
my website and it was a website

1330
00:56:54,366 --> 00:56:56,846
that I did not care much
for and that was good

1331
00:56:56,846 --> 00:56:58,046
because it went down
quite a bit.

1332
00:56:58,776 --> 00:57:01,116
So, what they're not
guaranteeing here is unlimited

1333
00:57:01,116 --> 00:57:02,346
uptime for instance.

1334
00:57:02,536 --> 00:57:04,416
So, there are some gotchas.

1335
00:57:04,636 --> 00:57:06,666
But frankly, if you're just
starting small, you just want

1336
00:57:06,666 --> 00:57:09,266
to experiment, you need a
place for testing a website

1337
00:57:09,266 --> 00:57:13,616
or you don't-- your 8.95 is more
compelling than several hundred dollars

1338
00:57:13,616 --> 00:57:17,306
or even more, this is
certainly quite compelling.

1339
00:57:17,756 --> 00:57:18,546
All right.

1340
00:57:18,886 --> 00:57:21,256
But as an aside things
like e-mail and calendar

1341
00:57:21,256 --> 00:57:22,736
and whatnot there's
other alternatives.

1342
00:57:22,736 --> 00:57:24,506
You don't need to get
those through your web host

1343
00:57:24,506 --> 00:57:26,116
when places like Google exist.

1344
00:57:26,526 --> 00:57:30,156
But suppose, you are not so
comfortable with that approach

1345
00:57:30,156 --> 00:57:33,016
and suppose too that
you're not comfortable also

1346
00:57:33,016 --> 00:57:35,986
with the fact that you
do not have any control

1347
00:57:35,986 --> 00:57:39,116
over a DreamHost like server
because it's being shared

1348
00:57:39,116 --> 00:57:40,876
by other people and
because it's being managed

1349
00:57:40,876 --> 00:57:44,876
by other people which is to
say, if they are running PHP 5.2

1350
00:57:44,936 --> 00:57:46,046
which is a few years old,

1351
00:57:46,686 --> 00:57:49,206
so are you. Like, you
are running PHP 5.2.

1352
00:57:49,206 --> 00:57:50,596
And if you want to
take advantage

1353
00:57:50,596 --> 00:57:53,546
of new language features that
were introduced in PHP 5.3

1354
00:57:53,696 --> 00:57:56,426
and more recently PHP
5.4, you're out of luck

1355
00:57:56,536 --> 00:57:57,226
like you're going to have

1356
00:57:57,226 --> 00:57:59,106
to either find a new web
host or just deal with it.

1357
00:57:59,106 --> 00:58:01,026
You can't just install
it yourself typically.

1358
00:58:01,356 --> 00:58:03,906
So, similarly, you can't
upgrade different versions

1359
00:58:03,906 --> 00:58:04,336
of software.

1360
00:58:04,336 --> 00:58:06,726
You can't necessarily
reconfigure the web server

1361
00:58:06,726 --> 00:58:07,366
at will.

1362
00:58:07,566 --> 00:58:09,376
Now, they might give
you some form of control

1363
00:58:09,376 --> 00:58:10,846
but you'll reach a point perhaps

1364
00:58:11,226 --> 00:58:12,426
where it's just too
frustrating not

1365
00:58:12,426 --> 00:58:14,296
to have administrative
access to the server.

1366
00:58:14,636 --> 00:58:16,756
So, you can still achieve that.

1367
00:58:16,756 --> 00:58:20,316
So, virtual private
servers, VPSs, are an alternative

1368
00:58:20,316 --> 00:58:21,726
to a shared web hosting model.

1369
00:58:22,316 --> 00:58:25,686
In the VPS world, you
get a dedicated server

1370
00:58:25,686 --> 00:58:27,756
to yourself, sort of.

1371
00:58:28,206 --> 00:58:31,006
You get the illusion of a
dedicated server to yourself.

1372
00:58:31,006 --> 00:58:33,196
So, thanks to a technology,
generically known

1373
00:58:33,196 --> 00:58:37,306
as virtualization, these
days, you can buy a server

1374
00:58:37,376 --> 00:58:40,656
with like a bunch of CPUs or a
bunch of cores, lots of RAM,

1375
00:58:40,766 --> 00:58:44,536
lots of disk space and then, you
can run virtualization software

1376
00:58:44,536 --> 00:58:47,336
on it, something
known generally as a hypervisor,

1377
00:58:47,596 --> 00:58:51,306
like VMware or Parallels
or VirtualBox.

1378
00:58:51,306 --> 00:58:53,706
There's a whole bunch of these
products, free and commercial

1379
00:58:53,706 --> 00:58:56,986
alike, out there that once
you run them and install them

1380
00:58:56,986 --> 00:59:00,666
on a server, on top of that
software, you can then,

1381
00:59:00,666 --> 00:59:03,736
install multiple instances of
Windows, multiple instances

1382
00:59:03,736 --> 00:59:06,576
of Linux, multiple instances,
if they allowed it, of Mac OS.

1383
00:59:07,226 --> 00:59:08,516
So you could create the illusion

1384
00:59:08,516 --> 00:59:11,096
of multiple distinct
computers each

1385
00:59:11,096 --> 00:59:12,896
of which has its own
user names and passwords,

1386
00:59:12,896 --> 00:59:15,536
its own administrative or
so-called root account.

1387
00:59:15,936 --> 00:59:18,186
And even though, they're
sharing the physical hardware,

1388
00:59:18,426 --> 00:59:20,946
they are not sharing
the same software.

1389
00:59:21,096 --> 00:59:24,186
So, what you would get as the
customer is the root log in

1390
00:59:24,186 --> 00:59:26,366
or the administrator
log in to your machine.

1391
00:59:26,576 --> 00:59:29,116
Now, there's still the
risk of resource contention

1392
00:59:29,296 --> 00:59:31,066
because these players
too will typically

1393
00:59:31,066 --> 00:59:33,806
overprovision, especially if
you're spending, you know,

1394
00:59:33,806 --> 00:59:38,466
9.95 a month and not 159.95
a month, you're probably going

1395
00:59:38,466 --> 00:59:40,506
to be on a server
with fewer resources

1396
00:59:40,506 --> 00:59:41,716
or with more customers.

1397
00:59:41,966 --> 00:59:43,526
But at least here,
you gain something.

1398
00:59:44,446 --> 00:59:45,876
And if you've been
following along,

1399
00:59:46,166 --> 00:59:48,306
what is it that fundamentally
you're gaining from a VPS

1400
00:59:48,306 --> 00:59:50,066
that you didn't get
from a web host?

1401
00:59:50,106 --> 00:59:54,686
>> You can choose to update
things, however you wish.

1402
00:59:54,686 --> 00:59:55,216
>> Exactly.

1403
00:59:55,216 --> 00:59:55,616
>> -- to update.

1404
00:59:55,856 --> 00:59:56,386
>> Control.

1405
00:59:56,386 --> 00:59:57,906
You can keep things up to date,

1406
00:59:57,906 --> 01:00:00,006
you can install whatever
you want and also,

1407
01:00:00,276 --> 01:00:04,356
if someone else's server is
compromised, odds are yours

1408
01:00:04,626 --> 01:00:06,196
might not be whereas

1409
01:00:06,196 --> 01:00:09,176
if a web hosting server is
compromised, everything

1410
01:00:09,176 --> 01:00:11,756
on that server is
potentially vulnerable.

1411
01:00:12,496 --> 01:00:14,346
So still not perfect
because the reality is

1412
01:00:14,346 --> 01:00:17,496
that even though you are the
only one now with root

1413
01:00:17,496 --> 01:00:20,846
or administrative access because
it is a dedicated albeit virtual

1414
01:00:20,846 --> 01:00:22,846
server for you, that
too is kind

1415
01:00:22,846 --> 01:00:24,436
of a white lie. Who
else has access?

1416
01:00:25,416 --> 01:00:26,116
>> The people there.

1417
01:00:26,436 --> 01:00:27,196
>> The people there.

1418
01:00:27,586 --> 01:00:30,966
Even if they don't know your
password they have physical

1419
01:00:30,966 --> 01:00:32,416
access to the machine
and as soon as you

1420
01:00:32,416 --> 01:00:34,106
have physical
access to any machine

1421
01:00:34,176 --> 01:00:36,636
in the world pretty much
you can compromise it.

1422
01:00:36,636 --> 01:00:38,836
You can boot-- you know, a Linux
computer for instance can be booted

1423
01:00:38,836 --> 01:00:40,296
in what's called
single user mode

1424
01:00:40,586 --> 01:00:43,046
by pretty much hitting the
letter S when it's booting up

1425
01:00:43,046 --> 01:00:45,496
and that circumvents
any request for password

1426
01:00:45,496 --> 01:00:46,696
at which point you
can even change it.

1427
01:00:46,696 --> 01:00:50,356
Even on PCs and computers
you can usually reset certain

1428
01:00:50,356 --> 01:00:53,796
passwords by opening the case up,
putting a little metal connector

1429
01:00:53,796 --> 01:00:55,396
on two pins and it
short circuits

1430
01:00:55,396 --> 01:00:57,086
out the password
and clears it out.

1431
01:00:57,456 --> 01:00:59,646
So in short, physical
access is bad for security,

1432
01:00:59,886 --> 01:01:03,286
so you're not gaining more
security fundamentally,

1433
01:01:03,456 --> 01:01:05,336
you're just making
it less likely

1434
01:01:05,336 --> 01:01:08,296
that someone else's
compromise will affect you.

1435
01:01:08,546 --> 01:01:10,956
And in some of these
systems-- some software--

1436
01:01:11,196 --> 01:01:14,716
the system administrators
will have the password

1437
01:01:14,716 --> 01:01:17,586
or at least access to the
root account on your server

1438
01:01:17,806 --> 01:01:20,846
so in short you should just
assume that this is for you

1439
01:01:21,126 --> 01:01:23,826
but probably at least one
other person could physically

1440
01:01:23,826 --> 01:01:25,396
access your contents.

1441
01:01:26,616 --> 01:01:29,286
So what do you get though
for the money? Well, here,

1442
01:01:29,286 --> 01:01:31,376
and frankly these numbers
are a little more compelling

1443
01:01:31,376 --> 01:01:33,526
because it's not unlimited
so I'm kind of inclined

1444
01:01:33,526 --> 01:01:35,496
to believe a bit more
about the quality

1445
01:01:35,496 --> 01:01:38,256
of service we're getting here
but 20 gigabytes of storage,

1446
01:01:38,256 --> 01:01:38,796
you know, that's fine

1447
01:01:38,796 --> 01:01:42,266
for a typical website unless your
website has a ridiculous amount

1448
01:01:42,266 --> 01:01:45,366
of traffic and database traffic
and logs which could build up

1449
01:01:45,366 --> 01:01:47,386
and start taking megs
or gigabytes of space.

1450
01:01:47,706 --> 01:01:50,206
Or if you allow users
to upload files

1451
01:01:50,206 --> 01:01:52,376
or photos then you might
need a lot of space

1452
01:01:52,606 --> 01:01:55,306
but for many websites, even if
they are dynamic, this is

1453
01:01:55,306 --> 01:01:56,266
probably plenty.

1454
01:01:56,606 --> 01:02:00,196
Transfer is an interesting one,
20-- 200 gigabytes per month,

1455
01:02:00,746 --> 01:02:03,326
for most websites that's
probably fine unless your

1456
01:02:03,326 --> 01:02:04,806
website is a photo website

1457
01:02:04,806 --> 01:02:08,156
or worse, a video website. Then
you have to start to do the math

1458
01:02:08,446 --> 01:02:10,236
and figure out exactly
how much trend--

1459
01:02:10,236 --> 01:02:12,786
data will be coming in and
out of your server based

1460
01:02:12,786 --> 01:02:15,726
on users' patterns and moreover
there's also corner cases

1461
01:02:15,726 --> 01:02:17,326
as we'll discuss toward
the end of the semester,

1462
01:02:17,326 --> 01:02:19,006
you got to worry about
the bad guys out there.

1463
01:02:19,266 --> 01:02:21,416
If someone just doesn't
like you or is bored

1464
01:02:21,416 --> 01:02:24,616
or downloads some free piece
of software that bangs up--

1465
01:02:24,616 --> 01:02:26,766
bangs the heck out of your
computer they could just eat

1466
01:02:26,766 --> 01:02:29,266
up your monthly allotment
of bandwidth just

1467
01:02:29,266 --> 01:02:30,936
by sending bogus traffic

1468
01:02:30,936 --> 01:02:33,596
or downloading the same video
again and again and again.

1469
01:02:34,016 --> 01:02:36,416
So there are very interesting
adversarial attacks

1470
01:02:36,706 --> 01:02:40,846
when you have finances somehow
tied to usage so you need

1471
01:02:40,846 --> 01:02:42,706
to be aware of that
especially with cloud computing.

1472
01:02:42,706 --> 01:02:46,956
And let's see, you get some
amount of RAM, 512 megabytes here,

1473
01:02:47,156 --> 01:02:48,576
and so forth. One of
the things we'll look

1474
01:02:48,576 --> 01:02:50,176
at during the semester
as we start playing

1475
01:02:50,176 --> 01:02:51,996
with Apache is we'll
give you the sense

1476
01:02:52,056 --> 01:02:55,286
of how you can assess how much
RAM your computer is using how

1477
01:02:55,286 --> 01:02:56,606
much disk space it's using.

1478
01:02:56,926 --> 01:02:58,816
I dare say one of the
most common platforms

1479
01:02:58,816 --> 01:03:01,666
for web hosting these
days whether it's a VPS

1480
01:03:01,666 --> 01:03:05,856
or it is a shared
web host is Linux

1481
01:03:05,856 --> 01:03:10,166
in some form whether it's Debian
or Fedora or Ubuntu or Red Hat

1482
01:03:10,166 --> 01:03:12,586
or CentOS or any number--
versions of Linux.

1483
01:03:12,856 --> 01:03:15,856
We'll happen to use Fedora in
the class but it's representative

1484
01:03:15,856 --> 01:03:18,456
of many similar operating
systems.

1485
01:03:18,726 --> 01:03:21,556
You can use Mac OS but it's
not really used commercially

1486
01:03:21,556 --> 01:03:22,606
to host websites just

1487
01:03:22,606 --> 01:03:25,056
because it's not really
geared toward that.

1488
01:03:25,346 --> 01:03:28,506
You can use Windows but you
really-- there's no good reason,

1489
01:03:28,786 --> 01:03:31,136
there's no technically
compelling reason

1490
01:03:31,136 --> 01:03:33,866
to use a Windows machine to
host things like PHP or Python

1491
01:03:33,866 --> 01:03:36,286
or Ruby because you're paying
money for a Windows license

1492
01:03:36,286 --> 01:03:39,096
to run free software so it's not
necessarily compelling unless

1493
01:03:39,096 --> 01:03:40,266
you already have the licenses

1494
01:03:40,266 --> 01:03:42,896
and have the machines.
Generally, going

1495
01:03:42,896 --> 01:03:47,076
with these open source tools
is quite common and compelling

1496
01:03:47,076 --> 01:03:48,426
because none of the
software we'll use

1497
01:03:48,426 --> 01:03:50,826
in the course costs
any money whatsoever

1498
01:03:51,036 --> 01:03:53,416
and it's nonetheless
quite popular and robust.

1499
01:03:54,156 --> 01:03:58,886
All right, so what
else? The appliance.

1500
01:03:59,036 --> 01:04:01,306
so we won't introduce it
tonight but we will in the form

1501
01:04:01,306 --> 01:04:04,506
of the first project so that you
have an experience in the class

1502
01:04:04,666 --> 01:04:06,576
that is as realistic
as possible.

1503
01:04:06,576 --> 01:04:07,576
What we'll actually have each

1504
01:04:07,576 --> 01:04:09,816
of you do is run
your own web server

1505
01:04:09,816 --> 01:04:11,406
and run your own
data base server

1506
01:04:11,406 --> 01:04:13,736
and actually run your
own copy of Linux itself.

1507
01:04:14,016 --> 01:04:16,196
For this we'll use another
tool that I use in another class

1508
01:04:16,196 --> 01:04:17,716
of mine called the
CS50 Appliance,

1509
01:04:17,856 --> 01:04:20,876
and this will be a
downloadable file, inside

1510
01:04:20,876 --> 01:04:22,766
of which is an installation
of Linux,

1511
01:04:22,766 --> 01:04:25,636
Fedora Linux specifically,
but also installed for you

1512
01:04:25,636 --> 01:04:27,256
in advance will be Apache,

1513
01:04:27,256 --> 01:04:29,126
which is web server
software, MySQL,

1514
01:04:29,126 --> 01:04:32,976
which is data base software, PHP
as well as support for bunches

1515
01:04:32,976 --> 01:04:35,306
of other languages and
standard tools and the like,

1516
01:04:35,596 --> 01:04:37,686
and the upside of this is that
rather than have you connect

1517
01:04:37,686 --> 01:04:39,886
for instance to some
random Harvard servers

1518
01:04:39,886 --> 01:04:42,126
on which you'll only
have temporary access,

1519
01:04:42,506 --> 01:04:44,556
this is a virtual machine
that you have on your computer

1520
01:04:44,556 --> 01:04:46,106
for as long as you
want to keep it around

1521
01:04:46,106 --> 01:04:48,466
and it's very representative
of the configuration

1522
01:04:48,696 --> 01:04:51,616
you would find on a VPS
or a commercial web host.

1523
01:04:51,616 --> 01:04:54,206
And because you'll
have root access on it

1524
01:04:54,206 --> 01:04:55,306
and because it will live

1525
01:04:55,306 --> 01:04:58,256
on your machine only you
have perfectly secure access

1526
01:04:58,256 --> 01:05:01,556
to it unless your laptop
or desktop is compromised

1527
01:05:01,756 --> 01:05:05,236
and you'll be able to configure
Apache and PHP and really tinker

1528
01:05:05,236 --> 01:05:08,786
with things and, best yet,
if you screw it up,

1529
01:05:08,886 --> 01:05:11,136
that's fine, just download a new
one and you're back to sort

1530
01:05:11,136 --> 01:05:13,286
of the beginning so long
as you've saved your code

1531
01:05:13,286 --> 01:05:15,746
somewhere, which
we'll advise you on how

1532
01:05:15,746 --> 01:05:16,626
and where to do.

1533
01:05:16,956 --> 01:05:20,076
So more on that in a week
or so. How would you connect

1534
01:05:20,076 --> 01:05:21,876
to this kind of thing, so SSH.

1535
01:05:21,876 --> 01:05:25,636
Does anyone use-- Does
anyone not use SSH here?

1536
01:05:25,696 --> 01:05:28,416
There is going to be some hands?

1537
01:05:29,356 --> 01:05:32,346
OK. So if you haven't used
SSH, it stands for Secure Shell

1538
01:05:32,586 --> 01:05:35,296
and this is a way,
sort of old school way

1539
01:05:35,296 --> 01:05:38,836
but now a much more secure one, of
connecting to a remote server

1540
01:05:38,836 --> 01:05:40,236
and executing commands on it.

1541
01:05:40,236 --> 01:05:43,706
So this is just a free program
that comes with Mac OS, Terminal,

1542
01:05:44,036 --> 01:05:46,666
and there are analogs for
Windows, like PuTTY.

1543
01:05:46,666 --> 01:05:49,226
It's a free program for Windows
that a lot of people like to use

1544
01:05:49,516 --> 01:05:51,376
and it allows you to open
up essentially a black

1545
01:05:51,376 --> 01:05:54,816
and white window or white
and black window and type

1546
01:05:54,816 --> 01:05:57,676
in a username and password,
connect to some remote server

1547
01:05:57,756 --> 01:05:59,306
and execute commands on it.

1548
01:05:59,306 --> 01:06:01,246
And those commands can
be to create files,

1549
01:06:01,246 --> 01:06:04,296
remove files, configure the
web server, turn the database

1550
01:06:04,296 --> 01:06:05,646
on or off, or the like.

1551
01:06:06,026 --> 01:06:08,866
So what you'll find, once we
start using this CS50 appliance

1552
01:06:08,866 --> 01:06:10,516
or virtual machine,
albeit running

1553
01:06:10,516 --> 01:06:13,666
on your own computer, not some
server, is that you will be able

1554
01:06:13,666 --> 01:06:17,286
to connect to that appliance
as though it's a remote server.

1555
01:06:17,666 --> 01:06:20,616
So that you never even need
to see the appliance itself.

1556
01:06:20,616 --> 01:06:23,286
Literally, once you turn it on,
you can minimize it and pretend

1557
01:06:23,286 --> 01:06:24,966
that it's a server
somewhere else on the internet

1558
01:06:25,136 --> 01:06:26,766
because once you install
it, the appliance,

1559
01:06:27,336 --> 01:06:30,196
a.k.a. virtual machine is going
to have its own IP address

1560
01:06:30,716 --> 01:06:32,946
but it's going to be
what type of IP address?

1561
01:06:33,796 --> 01:06:36,366
So it will be private
to your own laptop

1562
01:06:36,366 --> 01:06:38,576
and no one else can even access
it but you'll be able to go

1563
01:06:38,576 --> 01:06:41,146
through precisely the same
motions that you would

1564
01:06:41,186 --> 01:06:44,026
if you're actually paying some
third party to host your website

1565
01:06:44,026 --> 01:06:46,936
or if you own some server
elsewhere on the internet.

1566
01:06:46,936 --> 01:06:49,036
So SSH will be one of the
techniques that we use.

1567
01:06:49,336 --> 01:06:52,106
SFTP, for those unfamiliar:
this is a screenshot

1568
01:06:52,106 --> 01:06:55,206
of a popular Windows client
for transferring files,

1569
01:06:55,206 --> 01:06:58,006
called SecureFX, but others
exist, free ones in particular.

1570
01:06:58,006 --> 01:07:01,086
It just lets you drag and
drop files from your computer

1571
01:07:01,086 --> 01:07:03,866
to a server but in this
case, the server is going

1572
01:07:03,866 --> 01:07:06,276
to be a virtual machine
running on your own computer

1573
01:07:06,276 --> 01:07:08,676
that maybe-- maybe or
maybe not is minimized.

1574
01:07:08,926 --> 01:07:13,176
But again, the experience
would be precisely the same.

1575
01:07:13,366 --> 01:07:15,576
So where does that leave us?

1576
01:07:15,936 --> 01:07:20,096
So it turns out when
you are writing HTML,

1577
01:07:20,886 --> 01:07:24,786
you have fairly static content
but you do have these mechanisms

1578
01:07:24,786 --> 01:07:27,536
and I'm guessing most people
in the room have some or a lot

1579
01:07:27,536 --> 01:07:31,236
of experience with HTML and
basic websites and the like.

1580
01:07:31,466 --> 01:07:35,086
But these ultimately are
the basic input mechanisms

1581
01:07:35,086 --> 01:07:38,076
by which we can start
making dynamic websites.

1582
01:07:38,076 --> 01:07:40,826
In other words, we have text
fields, password fields,

1583
01:07:40,826 --> 01:07:43,706
hidden fields, checkboxes,
radio buttons, drop down menus,

1584
01:07:43,906 --> 01:07:47,386
so these are the mechanisms by
which we can start to get input

1585
01:07:47,386 --> 01:07:49,996
from users so that when they
interact with our website,

1586
01:07:49,996 --> 01:07:52,766
they don't necessarily see the
same thing; rather, they might see

1587
01:07:52,766 --> 01:07:55,496
different things every
time they visit that website.

1588
01:07:55,496 --> 01:07:57,746
So let's do a little
example here.

1589
01:07:58,166 --> 01:07:59,716
I'm going to go ahead and--

1590
01:07:59,936 --> 01:08:01,306
don't download this
on your own just

1591
01:08:01,306 --> 01:08:04,256
yet because we will be
posting a newer version soon--

1592
01:08:04,616 --> 01:08:09,696
but I'm going to go ahead and
open up a program called Fusion,

1593
01:08:10,406 --> 01:08:13,176
VMware Fusion. This is what's
called generally again a

1594
01:08:13,176 --> 01:08:16,146
hypervisor, virtualization
software,

1595
01:08:16,676 --> 01:08:19,416
and what I've just done
is essentially run Linux

1596
01:08:19,416 --> 01:08:21,706
on my own Mac and
you too can do this.

1597
01:08:21,706 --> 01:08:23,726
I'm going to actually use
the Linux desktop just

1598
01:08:23,726 --> 01:08:26,236
because it's here but I could
similarly minimize it as I will

1599
01:08:26,236 --> 01:08:28,266
in just a few minutes and
we'll connect to it as well.

1600
01:08:28,586 --> 01:08:31,006
So now, I'm running
Linux on my same computer

1601
01:08:31,006 --> 01:08:32,556
and notice I currently
have no IP address

1602
01:08:32,556 --> 01:08:35,046
but this should change in a
few seconds once I get a--

1603
01:08:35,046 --> 01:08:35,996
there we go.

1604
01:08:36,366 --> 01:08:40,096
So my Linux computer virtual
machine has just asked the

1605
01:08:40,096 --> 01:08:42,276
network to give me an
IP address and it came.

1606
01:08:42,576 --> 01:08:44,476
The protocol that computers use

1607
01:08:44,476 --> 01:08:49,716
to get IP addresses
dynamically, does anyone know?

1608
01:08:49,716 --> 01:08:52,466
DHCP, Dynamic Host Configuration
Protocol, that's what did that.

1609
01:08:52,636 --> 01:08:55,036
It's also how your own
personal computer works at home

1610
01:08:55,036 --> 01:08:57,216
and gets an IP address
from your Linksys router

1611
01:08:57,416 --> 01:08:58,576
or AirPort Extreme.

1612
01:08:58,956 --> 01:09:01,936
So I'm going to go
ahead and do this.

1613
01:09:02,406 --> 01:09:04,156
First let me go to
my, back to my Mac.

1614
01:09:04,156 --> 01:09:07,106
I'm going to open up the
simplest of programs, TextEdit,

1615
01:09:07,106 --> 01:09:08,876
which is what we used
earlier and I'm going

1616
01:09:08,876 --> 01:09:10,526
to just make a very
simple web page.

1617
01:09:10,866 --> 01:09:14,526
First, I'm going
to do DOCTYPE html.

1618
01:09:14,526 --> 01:09:18,026
So in the course, we'll use HTML5,
which is sort of the latest

1619
01:09:18,026 --> 01:09:25,136
and greatest version of HTML,
and I'm going to say "Hello."

1620
01:09:25,386 --> 01:09:26,066
Well, let's do this.

1621
01:09:26,726 --> 01:09:34,246
Let's call this Google and
the body and then close body

1622
01:09:34,316 --> 01:09:38,606
and then I'm going to
say Google, h1, OK,

1623
01:09:38,786 --> 01:09:40,216
I'm going to save this.

1624
01:09:40,216 --> 01:09:41,796
I'm just going to
go ahead and save it

1625
01:09:41,796 --> 01:09:44,376
on my desktop as Google.html.

1626
01:09:44,376 --> 01:09:47,766
I'm going to say yes use html
even though it's a text file

1627
01:09:48,056 --> 01:09:51,066
and now I'm going to go ahead
and pull up Google.html.

1628
01:09:52,206 --> 01:09:55,736
OK. So not exactly Google just
yet, I can do a little better

1629
01:09:55,736 --> 01:09:57,136
so let's make this
a little different.

1630
01:09:57,136 --> 01:10:03,036
So let me go in here, div
style, text-align: center.

1631
01:10:04,406 --> 01:10:06,906
So if any of this looks
completely cryptic or new,

1632
01:10:06,906 --> 01:10:08,866
these are the kinds of things
that we will take for granted

1633
01:10:08,866 --> 01:10:10,546
in the course, that this
stuff looks familiar.

1634
01:10:10,826 --> 01:10:11,896
So let me save and reload.

1635
01:10:11,896 --> 01:10:13,866
OK. Now, it looks a
little more like Google

1636
01:10:13,866 --> 01:10:16,956
but it's certainly lacking in
some key features. Among them?

1637
01:10:18,076 --> 01:10:19,396
All right, search bar, right?

1638
01:10:19,566 --> 01:10:20,366
So let's go there.

1639
01:10:20,366 --> 01:10:23,566
So let me start to make a simple
webpage that again odds are many

1640
01:10:23,566 --> 01:10:25,806
of you could have done
already because you know HTML

1641
01:10:25,806 --> 01:10:26,856
and forms and whatnot.

1642
01:10:26,856 --> 01:10:28,146
So, let's go ahead and do this.

1643
01:10:28,146 --> 01:10:30,746
So down here, I'm
going to say form,

1644
01:10:30,926 --> 01:10:34,026
I'm going to close my form,
I'm going to go in here

1645
01:10:34,696 --> 01:10:37,686
and I need an input
type equals text

1646
01:10:38,826 --> 01:10:41,056
and I probably need
a submit button,

1647
01:10:41,126 --> 01:10:45,146
so let me do input
type equals submit

1648
01:10:45,146 --> 01:10:47,886
and I'll give this a
value of Google search,

1649
01:10:47,886 --> 01:10:51,106
so that we really recreate it as
best we can. Save and reload.

1650
01:10:51,106 --> 01:10:52,606
OK. So now, it's getting there.

1651
01:10:52,866 --> 01:10:54,596
You know, this isn't
the prettiest thing,

1652
01:10:54,596 --> 01:11:01,506
so let me go ahead and do style
width, let's say 200 pixels.

1653
01:11:01,936 --> 01:11:03,396
Let's go back here and save

1654
01:11:03,396 --> 01:11:05,706
and it's still a little
small, 300 pixels.

1655
01:11:05,986 --> 01:11:07,366
OK, looks more Google like.

1656
01:11:07,366 --> 01:11:11,736
And then if we really want to
be anal here, we could do this.

1657
01:11:13,296 --> 01:11:16,206
I'm feeling lucky, OK.

1658
01:11:16,506 --> 01:11:18,496
So now we've got roughly
Google but in black and white.

1659
01:11:18,956 --> 01:11:21,126
So unfortunately, it
takes way more work

1660
01:11:21,126 --> 01:11:22,906
to implement the
backend of this website.

1661
01:11:22,976 --> 01:11:24,216
Right? So front end,
pretty easy.

1662
01:11:24,216 --> 01:11:26,136
We're pretty much done
other than some colors

1663
01:11:26,136 --> 01:11:27,516
and some other features
these days.

1664
01:11:27,896 --> 01:11:28,846
But what about the backend?

1665
01:11:29,096 --> 01:11:31,276
So if I actually wanted
to patch in to Google,

1666
01:11:31,276 --> 01:11:32,586
let's see if we can now revisit

1667
01:11:32,586 --> 01:11:34,186
that conversation
we started earlier.

1668
01:11:34,186 --> 01:11:36,616
When I type www.google.com
and hit enter,

1669
01:11:36,996 --> 01:11:38,036
what really is happening?

1670
01:11:38,036 --> 01:11:39,346
Let's take a look
underneath the hood.

1671
01:11:39,346 --> 01:11:41,346
Let's look at the
HTTP traffic and think

1672
01:11:41,346 --> 01:11:43,416
about what it is we're going
to start building next week

1673
01:11:43,416 --> 01:11:44,906
in terms of the actual backend.

1674
01:11:45,166 --> 01:11:47,956
So let's suspend
this mental thread,

1675
01:11:48,356 --> 01:11:53,446
pull up the actual google.com
and take a look at what is here.

1676
01:11:53,696 --> 01:11:58,566
I'm going to go ahead first so
that this is enlightening.

1677
01:11:58,696 --> 01:12:02,636
They hid it so we have

1678
01:12:02,636 --> 01:12:04,736
to disable this annoying
instant search feature.

1679
01:12:06,226 --> 01:12:07,526
Let's select this.

1680
01:12:08,686 --> 01:12:12,116
Gear icon, search settings,

1681
01:12:14,596 --> 01:12:17,836
Google Instant, how
do I disable you?

1682
01:12:18,716 --> 01:12:20,366
Never show instant results.

1683
01:12:20,466 --> 01:12:22,636
So the reason I want to do
this is we're not going to talk

1684
01:12:22,636 --> 01:12:24,796
about JavaScript and
AJAX, the technologies

1685
01:12:24,796 --> 01:12:26,686
that underlie this annoying

1686
01:12:26,686 --> 01:12:28,516
or beneficial instant
search feature.

1687
01:12:28,516 --> 01:12:31,376
We want to do sort of old
school HTTP searches right now.

1688
01:12:31,646 --> 01:12:32,686
So I've disabled that.

1689
01:12:33,126 --> 01:12:34,976
So now hopefully, I can save.

1690
01:12:35,466 --> 01:12:38,096
OK. Now, I'm going to
go back to google.com

1691
01:12:38,096 --> 01:12:39,866
and this is what it was like
five years ago when you wanted

1692
01:12:39,866 --> 01:12:40,996
to search for something
on the internet.

1693
01:12:41,366 --> 01:12:44,246
So now, I'm going to go ahead
and type for instance Harvard.

1694
01:12:44,506 --> 01:12:46,436
So it's still doing
auto complete

1695
01:12:46,436 --> 01:12:48,316
but it's not immediately
showing me the search results.

1696
01:12:48,546 --> 01:12:53,086
So now, notice before, here
is the URL I'm at, www.google.com.

1697
01:12:53,326 --> 01:12:56,816
And now after, let me hit
enter, now notice the URL.

1698
01:12:57,206 --> 01:12:59,446
So this is now hinting

1699
01:12:59,446 --> 01:13:02,326
at the fundamental
functionality of HTTP.

1700
01:13:02,326 --> 01:13:05,276
We have just issued one
of those GET requests.

1701
01:13:05,526 --> 01:13:06,636
We had two of them in fact.

1702
01:13:06,746 --> 01:13:09,546
The first one came up when I
visited the homepage then I hit

1703
01:13:09,546 --> 01:13:10,586
enter and it appears

1704
01:13:10,586 --> 01:13:12,396
that another GET
request has been sent.

1705
01:13:12,816 --> 01:13:14,506
Why? Because my URL changed.

1706
01:13:14,696 --> 01:13:17,836
So generally, anytime the
keyword GET is involved,

1707
01:13:17,936 --> 01:13:20,276
it's because the URL is
changing or equivalently

1708
01:13:20,276 --> 01:13:22,576
if the URL changes, you
just did a GET, most likely.

1709
01:13:22,976 --> 01:13:25,096
So there is a whole lot of
distracting stuff up there

1710
01:13:25,096 --> 01:13:28,466
but what is relevant
and what looks familiar

1711
01:13:28,466 --> 01:13:33,176
up there in the URL in gray?

1712
01:13:33,396 --> 01:13:34,786
I have no idea what HL is.

1713
01:13:35,866 --> 01:13:36,876
Sites, I don't know.

1714
01:13:36,876 --> 01:13:37,696
Source, I don't know.

1715
01:13:37,696 --> 01:13:38,536
But what looks familiar?

1716
01:13:39,266 --> 01:13:40,026
OK. Harvard.

1717
01:13:40,186 --> 01:13:42,636
So let me delete
manually all the stuff

1718
01:13:42,636 --> 01:13:45,486
that I have no idea what
it means, at least not yet.

1719
01:13:45,486 --> 01:13:47,136
So let me, I don't
know what this is.

1720
01:13:47,536 --> 01:13:49,166
Q equals Harvard that I--

1721
01:13:49,386 --> 01:13:51,936
oq equals Harvard,
oq, I don't know.

1722
01:13:51,936 --> 01:13:54,996
I'm just going to presumptuously
whittle it down to that

1723
01:13:54,996 --> 01:13:55,916
and now let's hit enter.

1724
01:13:56,316 --> 01:13:59,116
So interestingly, still
works and what's nice is

1725
01:13:59,116 --> 01:14:00,306
that there is much
less distraction

1726
01:14:00,306 --> 01:14:01,666
and we can have the same story

1727
01:14:01,736 --> 01:14:03,896
but with fewer distractions
in the tale.

1728
01:14:04,276 --> 01:14:08,036
So it looks like when hitting
enter on the previous page,

1729
01:14:08,036 --> 01:14:10,276
if I throw away the
distractions,

1730
01:14:10,886 --> 01:14:14,386
I have now visited, not
slash but slash search,

1731
01:14:14,756 --> 01:14:16,896
question mark, q equals Harvard.

1732
01:14:17,016 --> 01:14:17,816
So what is q?

1733
01:14:17,816 --> 01:14:20,316
Q is generally known
as an HTTP parameter.

1734
01:14:20,456 --> 01:14:22,176
So it is an input
to a web server

1735
01:14:22,336 --> 01:14:23,796
that generally comes
from a form.

1736
01:14:23,796 --> 01:14:24,866
But as we'll see in a few weeks,

1737
01:14:24,866 --> 01:14:26,256
it can also come
from JavaScript code.

1738
01:14:26,256 --> 01:14:27,866
It doesn't have to come
from a form per se.

1739
01:14:28,106 --> 01:14:29,786
Harvard is obviously
what I typed in.

1740
01:14:29,976 --> 01:14:31,176
So what is slash search?

1741
01:14:31,316 --> 01:14:34,576
Well, it's not obvious here what
programming language Google uses

1742
01:14:34,576 --> 01:14:36,986
but if we were on Facebook, we
would probably see search.php

1743
01:14:36,986 --> 01:14:40,126
because Facebook is
known for using PHP.

1744
01:14:40,126 --> 01:14:42,526
They're also known for not
hiding their file extensions,

1745
01:14:42,526 --> 01:14:44,946
which is very easy to do but
they just don't for some reason.

1746
01:14:45,236 --> 01:14:47,726
Google does hide their
file extension but a lot

1747
01:14:47,726 --> 01:14:49,856
of Google's code, at least
front end is written apparently

1748
01:14:49,856 --> 01:14:52,376
in Python or in some
other languages.

1749
01:14:52,376 --> 01:14:55,016
So it's not clear what
language is on the server

1750
01:14:55,296 --> 01:14:58,546
but slash search is
referring to some file

1751
01:14:58,546 --> 01:15:00,286
or some folder on the server.

1752
01:15:00,556 --> 01:15:01,836
What does the question
mark denote?

1753
01:15:02,286 --> 01:15:05,746
What's that?

1754
01:15:05,936 --> 01:15:06,126
>> Start.

1755
01:15:06,456 --> 01:15:07,746
>> The start of the parameters.

1756
01:15:07,886 --> 01:15:10,436
So anytime you have a
question mark in the URL,

1757
01:15:10,436 --> 01:15:13,606
that demarcates the path and
the preceding part of the URL

1758
01:15:13,886 --> 01:15:16,986
from all of the parameters and
parameters are key value pairs,

1759
01:15:16,986 --> 01:15:18,076
something equals something.

1760
01:15:18,326 --> 01:15:20,656
And if you have multiple
parameters, what separates them,

1761
01:15:20,936 --> 01:15:22,596
even though I already
deleted the others.

1762
01:15:22,656 --> 01:15:22,723
Yeah?

1763
01:15:23,106 --> 01:15:23,206
>> And.

1764
01:15:24,536 --> 01:15:28,146
>> The and, the ampersand
symbol.

1765
01:15:28,346 --> 01:15:30,586
So if I hadn't deleted
all of that,

1766
01:15:30,816 --> 01:15:33,536
recall that we saw something
like this just a moment ago

1767
01:15:33,536 --> 01:15:35,746
and oq equals Harvard.

1768
01:15:35,746 --> 01:15:37,046
And I don't know what oq is

1769
01:15:37,046 --> 01:15:39,396
but that's how you would
separate parameters,

1770
01:15:39,396 --> 01:15:40,146
with ampersands.

1771
01:15:40,466 --> 01:15:45,386
So this means we have
submitted a key of q and a value

1772
01:15:45,526 --> 01:15:46,646
of Harvard to the server.
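
The anatomy just described (path, question mark, key-value pairs joined by ampersands) can be sketched in a few lines of Python using only the standard library. This is a minimal illustration, not how Google or the course's PHP code handles it; the helper name `split_url` and the `http://` scheme in the sample URL are assumptions made for the example.

```python
from urllib.parse import urlsplit, parse_qs


def split_url(url):
    """Return the path and a dict of query parameters for a URL.

    Hypothetical helper for illustration only.
    """
    parts = urlsplit(url)
    # Everything after the "?" is the query string; "&" separates the pairs
    # and "=" separates each key from its value.
    params = parse_qs(parts.query)
    return parts.path, params


# Sample URL modeled on the one in the lecture (scheme is assumed).
path, params = split_url("http://www.google.com/search?q=harvard&oq=harvard")
print(path)    # /search
print(params)  # {'q': ['harvard'], 'oq': ['harvard']}
```

Each value comes back as a list because the same key may legally appear more than once in a query string.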

1773
01:15:46,976 --> 01:15:49,946
So now, let's use a fairly
common tool built into Chrome.

1774
01:15:49,946 --> 01:15:51,326
It's also built into Safari.

1775
01:15:51,666 --> 01:15:53,096
Firefox has something similar

1776
01:15:53,096 --> 01:15:54,846
and we recommend
something called Firebug

1777
01:15:54,846 --> 01:15:56,966
on the course's website
with which

1778
01:15:56,966 --> 01:15:58,106
to do the same kind of thing.

1779
01:15:58,416 --> 01:16:02,366
But I'm going to go to View,
Developer, and Developer Tools.

1780
01:16:02,366 --> 01:16:05,126
And I will say these days
certainly when using LAMP,

1781
01:16:05,326 --> 01:16:09,286
Linux, Apache, MySQL and PHP,
which is this course's focus,

1782
01:16:09,586 --> 01:16:12,366
many people are increasingly
using Chrome, one,

1783
01:16:12,366 --> 01:16:14,496
because it's popular, two,
because it's fast, three,

1784
01:16:14,496 --> 01:16:16,166
because it comes with
some developer tools.

1785
01:16:16,506 --> 01:16:19,306
I would say Firefox is
also wonderfully convenient

1786
01:16:19,306 --> 01:16:21,296
for doing development and
you should certainly test

1787
01:16:21,476 --> 01:16:24,596
on multiple browsers as
we'll require in one of the--

1788
01:16:24,596 --> 01:16:27,586
in the first project
spec. You can do Window--

1789
01:16:27,586 --> 01:16:29,286
rather Internet Explorer
is getting better

1790
01:16:29,286 --> 01:16:31,106
about having some
integrated development tools.

1791
01:16:31,106 --> 01:16:33,766
From the course's perspective, we
don't care what browser you use

1792
01:16:33,766 --> 01:16:36,106
because you'll be using again
the appliance as a server.

1793
01:16:36,326 --> 01:16:38,426
You can use whatever browser,
whatever operating system

1794
01:16:38,426 --> 01:16:40,786
on your own computer that
you're most comfortable with.

1795
01:16:40,876 --> 01:16:44,866
But if you are coming
to this with some,

1796
01:16:45,116 --> 01:16:49,686
I'd say less familiarity
with various tools,

1797
01:16:49,966 --> 01:16:52,286
Chrome, which is pretty popular,
and Firefox tend to be,

1798
01:16:52,286 --> 01:16:54,516
I think better for development
purposes even though you should

1799
01:16:54,516 --> 01:16:55,546
test on all of them.

1800
01:16:55,876 --> 01:16:56,856
So what am I seeing here?

1801
01:16:56,856 --> 01:17:00,706
I've just opened the developer
tabs, and now I have elements,

1802
01:17:00,786 --> 01:17:02,646
resources, network,
scripts, timeline,

1803
01:17:02,646 --> 01:17:03,946
profiles, audits and console.

1804
01:17:03,946 --> 01:17:05,336
We're not going to use
all of these but a few

1805
01:17:05,336 --> 01:17:06,666
of them are quite helpful, one,

1806
01:17:06,986 --> 01:17:09,506
the elements tab shows
you the page's HTML

1807
01:17:09,736 --> 01:17:12,266
but it pretty-prints it for you
and it makes it hierarchical

1808
01:17:12,266 --> 01:17:14,456
so that with those little
triangles, you can dive

1809
01:17:14,456 --> 01:17:17,726
in deeper and see even though
if we look at view source

1810
01:17:17,726 --> 01:17:20,906
for the page, it is an
utter mess of a page.

1811
01:17:20,906 --> 01:17:26,206
If I go over here
and view page source,

1812
01:17:26,656 --> 01:17:28,076
this is what came
back from the server.

1813
01:17:28,746 --> 01:17:31,786
And I would argue this is
not very readable to a human,

1814
01:17:33,006 --> 01:17:35,146
even when we get
to the actual HTML,

1815
01:17:35,566 --> 01:17:37,406
even the HTML is not that readable.

1816
01:17:37,466 --> 01:17:39,406
Color coded maybe,
still not useful.

1817
01:17:39,686 --> 01:17:40,546
So what does this do?

1818
01:17:40,686 --> 01:17:43,356
It actually-- The developer
toolbar actually parses it

1819
01:17:43,356 --> 01:17:44,736
for you so you can
start to navigate.

1820
01:17:44,836 --> 01:17:47,026
And this is actually wonderfully
compelling whether it's your

1821
01:17:47,026 --> 01:17:48,046
site or someone else's.

1822
01:17:48,046 --> 01:17:49,896
If it's someone else's,
it's a wonderful way

1823
01:17:49,896 --> 01:17:51,466
of learning how they
did something

1824
01:17:51,466 --> 01:17:52,786
or how they stylize something.

1825
01:17:52,786 --> 01:17:53,506
If it's your own site,

1826
01:17:53,686 --> 01:17:55,376
it's a wonderful way
of chasing down bugs.

1827
01:17:55,626 --> 01:17:58,816
And also as you'll see,
changing on the fly some

1828
01:17:58,816 --> 01:18:01,696
of the aesthetics without
having to change actual files

1829
01:18:01,696 --> 01:18:03,426
and then reload or re-upload.

1830
01:18:03,736 --> 01:18:05,936
So, we don't care so much
about elements right now

1831
01:18:06,066 --> 01:18:07,316
but we do care about network.

1832
01:18:07,526 --> 01:18:09,246
So let me go to the network tab.

1833
01:18:09,466 --> 01:18:14,546
And what this tab will
do for us is sniff all

1834
01:18:14,546 --> 01:18:17,406
of the network traffic between
my browser and the server

1835
01:18:17,496 --> 01:18:20,066
and it will show each
HTTP request one per line

1836
01:18:20,066 --> 01:18:20,516
at the bottom.

1837
01:18:20,516 --> 01:18:22,076
So I'm going to leave
this window open,

1838
01:18:22,076 --> 01:18:23,146
I'm going to click reload.

1839
01:18:23,146 --> 01:18:25,106
And again, this is my URL.

1840
01:18:26,296 --> 01:18:28,126
And here we go.

1841
01:18:28,356 --> 01:18:28,956
That's a lot.

1842
01:18:29,186 --> 01:18:31,156
I only hit reload once
but why in the world

1843
01:18:31,156 --> 01:18:33,366
did so many rows
appear down here?

1844
01:18:38,196 --> 01:18:41,246
I clicked once, but look how
much stuff just happened.

1845
01:18:41,476 --> 01:18:45,566
Why? Each of those again
represents an HTTP request.

1846
01:18:45,706 --> 01:18:48,426
So a virtual envelope from
browser to server and back.

1847
01:18:48,526 --> 01:18:48,636
Yeah?

1848
01:18:48,776 --> 01:18:50,516
>> Well there's a lot
going on behind the scenes.

1849
01:18:50,516 --> 01:18:51,756
>> What does that mean,
behind the scenes?

1850
01:18:51,966 --> 01:18:55,206
>> Well, Google doesn't just
have one method for [inaudible].

1851
01:18:55,386 --> 01:18:58,966
There's things that it has to
call, other things that it has

1852
01:18:58,966 --> 01:19:02,036
to find and bring up and so
that's all coming up in this--

1853
01:19:02,526 --> 01:19:03,226
>> OK, good.

1854
01:19:03,226 --> 01:19:04,656
So Google needs to
pull up other things.

1855
01:19:04,656 --> 01:19:07,016
Give me a concrete example of
something it has to pull up.

1856
01:19:07,016 --> 01:19:07,396
>> Like an image.

1857
01:19:07,646 --> 01:19:12,156
>> Good. So inside of the HTML
that's initially downloaded,

1858
01:19:12,156 --> 01:19:15,176
there could be an image tag, a
script tag, a link tag to CSS,

1859
01:19:15,176 --> 01:19:17,906
to JavaScript, to images,
it could be Flash files,

1860
01:19:17,996 --> 01:19:20,216
it could be a whole bunch of
other assets, so to speak.

1861
01:19:20,456 --> 01:19:23,096
And to get those, the
browser is predefined to sort

1862
01:19:23,296 --> 01:19:25,136
of recursively go
get those assets.

1863
01:19:25,136 --> 01:19:27,506
So if it sees a script
tag or an image tag,

1864
01:19:27,686 --> 01:19:30,016
it will send another
virtual envelope requesting

1865
01:19:30,016 --> 01:19:31,336
that file specifically.

1866
01:19:31,556 --> 01:19:33,426
It might do it over the
same network connection,

1867
01:19:33,426 --> 01:19:35,236
the same TCP socket,
so to speak,

1868
01:19:35,446 --> 01:19:37,716
but each of these rows
represents a different file

1869
01:19:37,796 --> 01:19:38,596
that was downloaded.
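
To make the idea concrete, here is a rough sketch of how one might list the extra files a page refers to, which is roughly the set of follow-up requests a browser would make. It uses Python's standard `html.parser`; the class name `AssetFinder`, the helper `find_assets`, and the sample markup are all hypothetical, and a real browser does considerably more (CSS imports, fonts, scripts that load other scripts).

```python
from html.parser import HTMLParser


class AssetFinder(HTMLParser):
    """Collects URLs referenced by img, script, and link tags (illustrative only)."""

    def __init__(self):
        super().__init__()
        self.assets = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        # Images and scripts name their file in "src"; link tags
        # (e.g. stylesheets) name it in "href".
        if tag in ("img", "script") and attrs.get("src"):
            self.assets.append(attrs["src"])
        elif tag == "link" and attrs.get("href"):
            self.assets.append(attrs["href"])


def find_assets(html):
    """Return the asset URLs found in an HTML string, in document order."""
    finder = AssetFinder()
    finder.feed(html)
    return finder.assets


# Made-up page for the example.
page = (
    '<link rel="stylesheet" href="style.css">'
    '<img src="logo.png">'
    '<script src="app.js"></script>'
)
print(find_assets(page))  # ['style.css', 'logo.png', 'app.js']
```

Each URL in that list would correspond to one more row in the network tab.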

1870
01:19:38,876 --> 01:19:42,256
Ironically, it seems that
Harvard is not behaving well

1871
01:19:42,256 --> 01:19:44,296
in terms of the auto
previews but that's good.

1872
01:19:44,296 --> 01:19:45,596
On another day we
can look at why.

1873
01:19:46,056 --> 01:19:47,436
But let's look at the first one

1874
01:19:47,436 --> 01:19:49,206
because that's the one that'll
be the most enlightening

1875
01:19:49,206 --> 01:19:49,726
for now.

1876
01:19:50,066 --> 01:19:52,946
And when I click on this,
there's a few details.

1877
01:19:52,946 --> 01:19:55,346
So, one, the preview is
just what was returned.

1878
01:19:55,516 --> 01:19:58,806
And here is another big mess
of results from the web server.

1879
01:19:58,806 --> 01:20:00,336
But we don't care
so much about that.

1880
01:20:00,336 --> 01:20:01,766
I care about the headers.

1881
01:20:02,046 --> 01:20:04,116
So let me zoom in on this.

1882
01:20:04,216 --> 01:20:06,866
And rather than look at this
fairly pretty-printed version

1883
01:20:06,866 --> 01:20:08,586
of it, I want to look
at the raw source.

1884
01:20:08,586 --> 01:20:11,526
So, we're diving in deep
sort of intellectually here

1885
01:20:11,526 --> 01:20:13,066
so let me look at view source.

1886
01:20:13,366 --> 01:20:17,926
Now this is what was literally
sent in that virtual envelope

1887
01:20:18,596 --> 01:20:20,656
that we started tonight's
discussion with.

1888
01:20:20,936 --> 01:20:22,576
So, there's the top line,

1889
01:20:22,576 --> 01:20:27,116
GET /search?q=harvard
HTTP/version number,

1890
01:20:27,116 --> 01:20:28,106
so that was in the envelope.

1891
01:20:28,426 --> 01:20:30,326
And we did promise there's
some other stuff in there.

1892
01:20:30,686 --> 01:20:32,956
Second line is a
reminder to the server

1893
01:20:33,146 --> 01:20:35,106
as to what the user typed in.

1894
01:20:35,106 --> 01:20:36,096
So what is the host name.

1895
01:20:36,336 --> 01:20:38,146
Now frankly Google is
not sharing their servers

1896
01:20:38,146 --> 01:20:39,486
with other companies most likely

1897
01:20:39,486 --> 01:20:41,186
so this doesn't really
matter there.

1898
01:20:41,416 --> 01:20:43,036
But for shared web
hosting companies,

1899
01:20:43,086 --> 01:20:46,916
the fact that I'm being reminded
what the URL was means I can

1900
01:20:46,916 --> 01:20:49,606
serve up foo.com or
bar.com or baz.com,

1901
01:20:49,606 --> 01:20:51,466
so thankfully HTTP does that.

1902
01:20:51,886 --> 01:20:55,336
There's some arcane information
here related to caching

1903
01:20:55,336 --> 01:20:56,956
and connections and efficiency.

1904
01:20:57,186 --> 01:20:59,236
Well let me wave my
hand at that for now.

1905
01:20:59,236 --> 01:21:02,166
User-Agent is interesting and
you might know this already

1906
01:21:02,166 --> 01:21:05,866
but if you don't, every webpage
you've ever visited has--

1907
01:21:05,916 --> 01:21:08,976
every website you've ever
visited knows what computer you

1908
01:21:08,976 --> 01:21:10,906
have and what operating
system you're running

1909
01:21:10,906 --> 01:21:12,496
and what browser you were using.

1910
01:21:12,786 --> 01:21:13,346
Why is that?

1911
01:21:13,346 --> 01:21:16,746
Well browsers by default reveal
precisely that information.

1912
01:21:16,796 --> 01:21:19,006
I have just told
Google, behind the scenes

1913
01:21:19,006 --> 01:21:24,326
that I have a Mac running Mac
OS X 10.7.4 and if I scroll

1914
01:21:24,326 --> 01:21:26,196
down further, they
will be able to infer

1915
01:21:26,196 --> 01:21:28,526
that I was using Chrome
version something or other.

1916
01:21:29,076 --> 01:21:32,096
So why in the world
is that useful?

1917
01:21:32,306 --> 01:21:32,386
Yeah?

1918
01:21:33,516 --> 01:21:42,556
[ Inaudible Remark ]

1919
01:21:43,056 --> 01:21:46,346
Good. So, arguably, this is
useful for debugging purposes,

1920
01:21:46,346 --> 01:21:49,596
useful for demographic purposes
to know who your users are.

1921
01:21:51,516 --> 01:21:55,616
[ Inaudible Remark ]

1922
01:21:56,116 --> 01:22:00,306
Good. So there are some
features that could be dictated

1923
01:22:00,306 --> 01:22:02,756
by what type of OS or
browser someone is using.

1924
01:22:02,756 --> 01:22:04,076
For instance, if
you go to a website

1925
01:22:04,076 --> 01:22:06,956
that lets you download software,
you know, it's not necessary

1926
01:22:06,956 --> 01:22:09,416
that you detect what the user's
operating system and browser are

1927
01:22:09,506 --> 01:22:11,186
but it's kind of a
nicer user experience

1928
01:22:11,266 --> 01:22:14,126
if the server only shows
you the Mac software

1929
01:22:14,366 --> 01:22:16,516
because you clearly have a
Mac as opposed to me having

1930
01:22:16,516 --> 01:22:17,866
to figure out which
of the links to click

1931
01:22:17,906 --> 01:22:20,736
for Linux or Windows or Mac OS.

1932
01:22:20,736 --> 01:22:23,416
Another argument frankly is that
this is completely unnecessary

1933
01:22:23,416 --> 01:22:24,236
and we should never have gotten

1934
01:22:24,236 --> 01:22:25,346
into this habit in
the first place.

1935
01:22:25,476 --> 01:22:28,896
Because if for-- it's not
necessarily used all that much.

1936
01:22:29,206 --> 01:22:31,046
And indeed, writing websites

1937
01:22:31,186 --> 01:22:34,556
that require knowing what
the user's browser is,

1938
01:22:34,556 --> 01:22:36,216
is actually generally
bad practice

1939
01:22:36,356 --> 01:22:38,206
because there will be
certain privacy tools

1940
01:22:38,516 --> 01:22:39,896
that users can install
in their computer

1941
01:22:39,896 --> 01:22:41,946
that just hide this
information altogether

1942
01:22:42,146 --> 01:22:43,136
for better or for worse.

1943
01:22:43,136 --> 01:22:45,536
And if you're relying on
certain headers to be sent,

1944
01:22:45,776 --> 01:22:47,396
your own website
could misbehave.

1945
01:22:47,736 --> 01:22:49,426
So there are-- it turns
out there's other tricks

1946
01:22:49,426 --> 01:22:51,776
for doing detection and
typically as we'll see

1947
01:22:51,776 --> 01:22:53,466
in JavaScript, it's
better generally

1948
01:22:53,466 --> 01:22:56,986
to detect whether a browser
has a certain feature rather

1949
01:22:56,986 --> 01:22:59,206
than is it a specific
operating system

1950
01:22:59,206 --> 01:23:00,856
or a specific version
of a browser.

1951
01:23:01,026 --> 01:23:03,996
However, databases freely
available exist that allow you

1952
01:23:03,996 --> 01:23:07,726
to figure out based on the
so-called User-Agent strings,

1953
01:23:08,316 --> 01:23:10,996
what version of browser

1954
01:23:10,996 --> 01:23:12,116
and operating system
someone's using.

1955
01:23:12,116 --> 01:23:13,666
Because frankly this is
a little hard to read,

1956
01:23:13,666 --> 01:23:15,766
so software exists
that simplifies this.

1957
01:23:15,766 --> 01:23:18,816
So you can just check a boolean
variable, is Mac or is PC.
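A minimal sketch of that kind of check, assuming naive substring matching rather than any real User-Agent database. The function name `detect_os` and the sample string are hypothetical; the sample only imitates the general shape of what a Mac build of Chrome might send.

```python
def detect_os(user_agent):
    """Guess an operating system from a User-Agent string.

    Naive substring checks only; real strings vary widely, and the
    header may be missing if a privacy tool strips it.
    """
    user_agent = user_agent or ""
    if "Windows" in user_agent:
        return "Windows"
    if "Mac OS X" in user_agent or "Macintosh" in user_agent:
        return "Mac"
    if "Linux" in user_agent:
        return "Linux"
    return "unknown"


# Hypothetical header value, not copied from the lecture's screen.
sample = (
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_7_4) "
    "AppleWebKit/536.5 (KHTML, like Gecko) Chrome/19.0 Safari/536.5"
)

is_mac = detect_os(sample) == "Mac"
is_pc = detect_os(sample) == "Windows"
print(is_mac, is_pc)
```

Falling back to "unknown" rather than raising an error is deliberate here, since a site that depends on this header being present can misbehave when it is hidden.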

1958
01:23:18,996 --> 01:23:20,276
All right, what's below?

1959
01:23:20,566 --> 01:23:22,776
Some more arcane details
that I'll wave my hand at.

1960
01:23:23,036 --> 01:23:24,946
Cookies we'll come back
to in a week or two

1961
01:23:24,946 --> 01:23:26,856
when we actually start
using them to our advantage

1962
01:23:26,856 --> 01:23:29,206
but we'll also talk about the
security implications of them.

1963
01:23:29,486 --> 01:23:31,456
But in a nutshell,
all of these headers,

1964
01:23:31,456 --> 01:23:35,146
just text, is what was inside
of that virtual envelope.

1965
01:23:35,146 --> 01:23:37,606
And the most important one
arguably was the very first one

1966
01:23:37,606 --> 01:23:39,076
because that tells
Google what to return.

1967
01:23:39,426 --> 01:23:42,476
But now we see that it's not
just slash, it's a full path.

1968
01:23:43,016 --> 01:23:47,266
So Google has hopefully parsed
that string, so to speak, slash,

1969
01:23:47,266 --> 01:23:49,186
search, question mark,
q equals Harvard,

1970
01:23:49,416 --> 01:23:53,216
and then used the q equals
Harvard as input to its database

1971
01:23:53,216 --> 01:23:56,196
or whatnot to return
customized results to me.

1972
01:23:56,726 --> 01:23:59,026
Now if we scroll down, let's
see how Google replied.

1973
01:23:59,656 --> 01:24:01,446
So this is just a Chrome thing.

1974
01:24:01,696 --> 01:24:03,476
It is just kind of a
dumbed-down display

1975
01:24:03,476 --> 01:24:05,326
of the query string
parameter, so it's just useful.

1976
01:24:05,326 --> 01:24:07,366
Especially for a developer you
can just see it more easily

1977
01:24:07,366 --> 01:24:07,806
this way.

1978
01:24:08,196 --> 01:24:10,786
But let me go ahead and view
source now for response headers.

1979
01:24:11,116 --> 01:24:12,876
This is what the
server responded with.

1980
01:24:13,286 --> 01:24:17,876
So it turns out many of you have
seen numbers returned by a server.

1981
01:24:17,876 --> 01:24:20,236
Who has ever seen the
message 404 come back?

1982
01:24:20,886 --> 01:24:21,916
OK. What does 404 mean?

1983
01:24:22,016 --> 01:24:23,066
[ Inaudible Remark ]

1984
01:24:23,066 --> 01:24:25,526
File not found, right.

1985
01:24:25,606 --> 01:24:27,606
So it's an HTTP status code.

1986
01:24:27,606 --> 01:24:30,016
It's an arbitrary number the
world decided on years ago

1987
01:24:30,016 --> 01:24:31,416
that means file is not found.

1988
01:24:31,416 --> 01:24:34,326
What are some others you might
have seen besides this one?

1989
01:24:35,016 --> 01:24:36,046
[ Inaudible Remark ]

1990
01:24:36,046 --> 01:24:38,796
501, so internal server
error of some sort.

1991
01:24:39,266 --> 01:24:44,196
503, it's another internal
error or resource forbidden.

1992
01:24:44,576 --> 01:24:48,176
There's 403 rather
which is forbidden.

1993
01:24:48,966 --> 01:24:49,696
What's that?

1994
01:24:49,696 --> 01:24:51,276
>> 301 and 302.

1995
01:24:51,346 --> 01:24:54,616
>> 301 and 302 are redirects
which are actually quite useful

1996
01:24:54,616 --> 01:24:57,066
and we'll start using those
in the next lecture or two.

1997
01:24:57,206 --> 01:24:59,196
So in short, there's some codes
that you've probably seen.

1998
01:24:59,196 --> 01:25:00,636
404 is maybe the most popular.

1999
01:25:00,866 --> 01:25:03,256
200 you might not have ever
seen, but this is the best one

2000
01:25:03,256 --> 01:25:07,346
of all, 200 is literally OK, it
means everything worked out well

2001
01:25:07,346 --> 01:25:09,826
so you just don't see it
because it indicates success

2002
01:25:09,826 --> 01:25:11,336
as this little green icon

2003
01:25:11,336 --> 01:25:14,096
that we saw a moment ago
before I expanded this.

2004
01:25:14,456 --> 01:25:16,546
So, this is the server's
response, 200 means,

2005
01:25:16,676 --> 01:25:18,196
found what you're
looking for, here it is.
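For reference, the standard reason phrases for the codes mentioned here can be listed with Python's built-in `http.HTTPStatus`. This is only a lookup sketch; the particular selection of codes is an assumption made for the example.

```python
from http import HTTPStatus

# Codes that came up in the discussion, plus 400 and 500 for context.
codes = (200, 301, 302, 400, 403, 404, 500, 501, 503)

# Map each numeric code to its standard reason phrase.
phrases = {code: HTTPStatus(code).phrase for code in codes}

for code, phrase in phrases.items():
    print(code, phrase)
```

In the standard list, 500 is Internal Server Error, 501 is Not Implemented, and 503 is Service Unavailable.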

2006
01:25:18,556 --> 01:25:19,726
Now, what else comes down?

2007
01:25:20,236 --> 01:25:24,686
We have the date from the server
which might be useful, expires

2008
01:25:24,686 --> 01:25:27,686
and cache control, so directives
to the browser saying, do or

2009
01:25:27,686 --> 01:25:29,866
don't cache this, even though
these are not necessarily

2010
01:25:29,866 --> 01:25:31,986
reliable, but we'll talk
about this when we get

2011
01:25:31,986 --> 01:25:33,806
to PHP, this is interesting.

2012
01:25:33,876 --> 01:25:37,156
Set-Cookie, Set-Cookie is
amazingly powerful, if not,

2013
01:25:37,156 --> 01:25:39,536
a little unsettling especially
in the world of advertising

2014
01:25:39,536 --> 01:25:40,906
and tracking, but
we'll talk about that

2015
01:25:40,906 --> 01:25:42,496
in the context of PHP.

2016
01:25:42,496 --> 01:25:45,056
Notice that the server is
telling us that it supports gzip

2017
01:25:45,316 --> 01:25:47,326
which is like a compression
utility,

2018
01:25:47,626 --> 01:25:50,046
which is a compression utility
and this just means, hey,

2019
01:25:50,176 --> 01:25:53,806
you can compress your
data to and from me.

2020
01:25:53,806 --> 01:25:57,556
The name of the server, gws,
probably Google web services

2021
01:25:57,716 --> 01:25:59,366
and then some headers
that they use for some

2022
01:25:59,366 --> 01:26:00,526
of the security things
we'll talk

2023
01:26:00,526 --> 01:26:02,066
about later in the semester.

2024
01:26:02,066 --> 01:26:04,466
So that's what Google
has returned in addition

2025
01:26:04,466 --> 01:26:07,256
to the content that has
come back from the server.

2026
01:26:07,596 --> 01:26:09,786
So let's see this outside
the scope of our browser.

2027
01:26:09,786 --> 01:26:13,146
I'm going to go on and open
a program called Terminal

2028
01:26:13,746 --> 01:26:15,676
which comes with Mac
OS, for those of you

2029
01:26:15,676 --> 01:26:18,326
with Windows PuTTY is another
option and we'll look at that

2030
01:26:18,776 --> 01:26:21,816
or encourage that in-- my
music is still playing.

2031
01:26:22,366 --> 01:26:24,986
We'll look at that,
we'll recommend

2032
01:26:24,986 --> 01:26:26,376
that for future projects, and I'm going
and I'm going

2033
01:26:26,376 --> 01:26:27,576
to run a program called telnet,

2034
01:26:27,686 --> 01:26:29,936
telnet is like SSH
but unencrypted.

2035
01:26:29,936 --> 01:26:31,606
You know, that's a bit of
an oversimplification.

2036
01:26:31,876 --> 01:26:34,166
I'm going to go ahead
and telnet to google.com

2037
01:26:34,546 --> 01:26:36,316
and nothing actually happens.

2038
01:26:36,426 --> 01:26:38,136
But it did figure out
Google's IP address,

2039
01:26:38,246 --> 01:26:39,276
so that's interesting.

2040
01:26:39,726 --> 01:26:46,036
But telnet by default uses port
20-- it's been so long, 21,

2041
01:26:46,486 --> 01:26:49,136
telnet uses port 21, TCP port 21

2042
01:26:49,516 --> 01:26:52,316
but there's no telnet server
there, telnet used to be used

2043
01:26:52,316 --> 01:26:54,706
to send messages and connect
to email servers and the like.

2044
01:26:54,706 --> 01:26:57,626
But what if I instead say
80, so there's no colon

2045
01:26:57,626 --> 01:26:59,206
in this program, there
is in the browser.

2046
01:26:59,386 --> 01:27:01,356
But this is going to
connect from my laptop

2047
01:27:01,356 --> 01:27:03,266
to google.com on TCP port 80.

2048
01:27:03,396 --> 01:27:05,156
So this is interesting.

2049
01:27:05,466 --> 01:27:08,076
Now, I've connected
to their server, why?

2050
01:27:08,226 --> 01:27:09,206
Or how do I know that?

2051
01:27:09,286 --> 01:27:12,726
It's telling me, connected
to www.l.google.com.

2052
01:27:13,066 --> 01:27:14,146
Where did this l come from?

2053
01:27:14,366 --> 01:27:16,036
They're doing some DNS
trickery, it's probably

2054
01:27:16,036 --> 01:27:18,466
for load balancing purposes,
they have multiple servers,

2055
01:27:18,466 --> 01:27:20,606
so therefore, I've gotten
one of them, specifically.

2056
01:27:20,606 --> 01:27:22,336
But I'm going to
pretend to be a browser.

2057
01:27:22,336 --> 01:27:26,036
I'm going to say, get me
slash using HTTP version 1.1

2058
01:27:26,266 --> 01:27:27,266
and then hit enter.

2059
01:27:27,416 --> 01:27:29,116
And then if that's the
last of my headers,

2060
01:27:29,116 --> 01:27:32,886
I have to hit enter twice,
and voilà, what do I see?

2061
01:27:33,226 --> 01:27:36,036
Well the font is kind of big
and HTML and JavaScript is kind

2062
01:27:36,036 --> 01:27:39,136
of minified but that's exactly
what my browser got back.

2063
01:27:39,136 --> 01:27:41,176
But if I keep going up
and up and up and up,

2064
01:27:41,486 --> 01:27:45,486
notice I can see exactly what
the server's response was.

2065
01:27:45,896 --> 01:27:48,566
So I see my HTTP headers that
came back from the server,

2066
01:27:49,146 --> 01:27:50,856
Set-Cookie and all
those same lines,

2067
01:27:50,856 --> 01:27:52,306
exactly what I saw
on the browser.

2068
01:27:52,306 --> 01:27:54,796
So I've just pretended
really to be a browser.

2069
01:27:54,796 --> 01:27:57,126
And we can do this with
any website and it's more

2070
01:27:57,126 --> 01:27:59,696
than just a curiosity, it can
actually help with debugging

2071
01:27:59,696 --> 01:28:01,986
or actually seeing what's
coming back from a server.
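The same "pretend to be a browser" trick can be sketched in Python with a plain TCP socket instead of telnet. The function names `build_request` and `fetch_raw` are hypothetical, and the sketch assumes plain HTTP on port 80. It sends a Host line because HTTP/1.1 requires one, and a Connection: close line so the server hangs up when it is done.

```python
import socket


def build_request(host, path="/"):
    """Return the raw bytes of a minimal HTTP/1.1 GET request."""
    lines = [
        f"GET {path} HTTP/1.1",
        f"Host: {host}",
        "Connection: close",
        "",  # blank line that ends the headers
        "",
    ]
    return "\r\n".join(lines).encode("ascii")


def fetch_raw(host, port=80, path="/"):
    """Send the request and return everything the server sends back."""
    with socket.create_connection((host, port), timeout=10) as conn:
        conn.sendall(build_request(host, path))
        chunks = []
        while True:
            data = conn.recv(4096)
            if not data:  # server closed the connection
                break
            chunks.append(data)
    return b"".join(chunks)


# Show what would go "in the envelope" without touching the network.
print(build_request("www.google.com").decode("ascii"))
```

Calling `fetch_raw("www.google.com")` would return the status line, the response headers, and the body as raw bytes, which is what scrolls by in the terminal.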

2072
01:28:02,266 --> 01:28:08,806
I can do www.harvard.edu 80.

2073
01:28:08,806 --> 01:28:11,236
GET / HTTP/1.1, enter, enter.

2074
01:28:11,886 --> 01:28:13,036
So, interesting.

2075
01:28:13,036 --> 01:28:13,696
Bad request.

2076
01:28:14,716 --> 01:28:15,476
Now why is that?

2077
01:28:15,666 --> 01:28:17,516
So we see some HTML,
because this--

2078
01:28:17,516 --> 01:28:18,656
the web server assumes

2079
01:28:18,656 --> 01:28:20,246
that a browser will
typically be doing this.

2080
01:28:20,876 --> 01:28:22,396
Why might this be a bad request?

2081
01:28:23,156 --> 01:28:27,576
I'm actually going
to guess here.

2082
01:28:27,576 --> 01:28:33,196
Let's try this, GET / HTTP/1.1,
Host: harvard.edu.

2083
01:28:33,726 --> 01:28:34,186
There we go.

2084
01:28:34,566 --> 01:28:38,466
So it didn't like the fact that
I did not send the host header

2085
01:28:38,726 --> 01:28:41,576
which means Harvard's web server
is probably using something

2086
01:28:41,576 --> 01:28:44,426
called virtual hosting, which
is that feature I alluded

2087
01:28:44,426 --> 01:28:47,136
to earlier when a
website can support--

2088
01:28:47,246 --> 01:28:50,476
when a web server can support
multiple websites but for

2089
01:28:50,616 --> 01:28:53,086
that to work, browsers
have to cooperate.

2090
01:28:53,376 --> 01:28:55,516
And the fact that I did
not send that header meant

2091
01:28:55,516 --> 01:28:57,456
that the server didn't know
whose home page to return

2092
01:28:57,726 --> 01:29:00,806
so it gave me that 400 response
of I don't know what to do.
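A toy sketch of virtual hosting, assuming a single server process that picks a site by the Host header. The `SITES` table, its domain names, and the `respond` function are hypothetical and are not Harvard's actual configuration.

```python
# Hypothetical sites sharing one server (and one IP address).
SITES = {
    "foo.com": "<h1>Welcome to foo</h1>",
    "bar.com": "<h1>Welcome to bar</h1>",
}


def respond(headers):
    """Choose a site from the Host header; return (status, body)."""
    host = headers.get("Host")
    if host is None:
        # Without Host the server cannot tell which site was meant.
        return 400, "Bad Request"
    host = host.split(":")[0].lower()  # drop any :port suffix
    if host not in SITES:
        return 404, "Not Found"
    return 200, SITES[host]


print(respond({}))
print(respond({"Host": "foo.com"}))
```

The first call mirrors the request with no Host line, and the second mirrors the corrected request.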

2093
01:29:01,166 --> 01:29:03,096
Now let's try one other
thing, let me cancel this

2094
01:29:04,006 --> 01:29:08,366
and let me do, telnet
to not www.harvard.edu,

2095
01:29:08,366 --> 01:29:09,976
let's try this one,
just see what happens.

2096
01:29:10,446 --> 01:29:14,116
So GET / HTTP/1.1, enter, enter.

2097
01:29:14,116 --> 01:29:15,466
OK. It didn't like that.

2098
01:29:15,466 --> 01:29:16,496
So let's fix this again.

2099
01:29:16,996 --> 01:29:24,286
So, GET / HTTP/1.1, Host:
harvard.edu, enter, enter.

2100
01:29:25,036 --> 01:29:25,536
Interesting.

2101
01:29:25,816 --> 01:29:26,816
This is not the home page?

2102
01:29:26,816 --> 01:29:29,806
What did I get this time?

2103
01:29:29,966 --> 01:29:31,926
Some message about it's moved,

2104
01:29:32,366 --> 01:29:34,616
harvard.edu has moved
permanently no less.

2105
01:29:34,806 --> 01:29:37,566
And if I scroll up,
more esoterically

2106
01:29:37,566 --> 01:29:39,716
in the headers is one,
a different status code,

2107
01:29:39,716 --> 01:29:43,336
301 which we mentioned earlier,
301 means permanent redirect.

2108
01:29:43,696 --> 01:29:45,436
If a browser receives a 301,

2109
01:29:45,956 --> 01:29:47,776
it should never ask
that question again.

2110
01:29:48,076 --> 01:29:50,636
It should just remember
harvard.edu moved,

2111
01:29:50,826 --> 01:29:51,836
and it moved where?

2112
01:29:52,006 --> 01:29:53,706
To the value of the
location field

2113
01:29:53,706 --> 01:29:56,096
which should also be included
in the response headers.

2114
01:29:56,096 --> 01:29:56,906
How did that happen?

2115
01:29:57,646 --> 01:29:59,786
Well some system
administrator or a dean

2116
01:29:59,786 --> 01:30:02,396
at Harvard just decided
arbitrarily that's--

2117
01:30:02,466 --> 01:30:04,676
but reasonably, that we
don't want to standardize

2118
01:30:04,676 --> 01:30:07,426
on harvard.edu in our
browsers and people's browsers.

2119
01:30:07,426 --> 01:30:12,636
We want them automatically to be
redirected to www.harvard.edu.

2120
01:30:12,676 --> 01:30:15,916
Why? One, branding, I
mean, that's one reason,

2121
01:30:15,916 --> 01:30:17,336
which is perfectly reasonable.

2122
01:30:17,616 --> 01:30:20,146
Two, more technologically,
it can be better for,

2123
01:30:20,766 --> 01:30:22,336
security is a bit
of an overstatement,

2124
01:30:22,336 --> 01:30:23,726
but for technical reasons,

2125
01:30:23,726 --> 01:30:26,796
having the www means your
cookies can be isolated

2126
01:30:26,796 --> 01:30:31,816
to www.harvard.edu, whereas, if
your cookies were instead sent

2127
01:30:31,816 --> 01:30:34,946
to harvard.edu, that means your
cookies could be read really

2128
01:30:34,946 --> 01:30:38,656
by any website, so
including cs.harvard.edu

2129
01:30:38,656 --> 01:30:41,016
or summer.harvard.edu.

2130
01:30:41,276 --> 01:30:44,496
So by saying www, you're
also forcing at least

2131
01:30:44,496 --> 01:30:46,996
by default cookies to be
more precisely defined.
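A simplified sketch of how a browser decides which hosts receive a cookie, loosely following the domain-matching idea in the cookie specification. The function name and parameters are assumptions for illustration, and real browsers apply more rules than this.

```python
def cookie_sent_to(request_host, set_by_host, domain_attr=None):
    """Return True if a cookie would be sent to request_host.

    With no Domain attribute the cookie is host-only: it goes back
    only to the exact host that set it. With a Domain attribute it
    also goes to every subdomain of that domain.
    """
    request_host = request_host.lower()
    if domain_attr is None:
        return request_host == set_by_host.lower()
    domain = domain_attr.lstrip(".").lower()
    return request_host == domain or request_host.endswith("." + domain)


# Host-only cookie set by www.harvard.edu stays on www.harvard.edu.
print(cookie_sent_to("cs.harvard.edu", "www.harvard.edu"))
# Cookie scoped to harvard.edu is visible to other subdomains too.
print(cookie_sent_to("cs.harvard.edu", "harvard.edu", "harvard.edu"))
```

The second case is the one the lecture warns about: a cookie scoped to the bare domain reaches cs.harvard.edu and summer.harvard.edu as well.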

2132
01:30:46,996 --> 01:30:48,486
So there are some
technical reasons as well.

2133
01:30:48,796 --> 01:30:51,706
Only a year or two ago was
this problem fixed, they--

2134
01:30:51,706 --> 01:30:54,546
few years back, someone
new came to Harvard

2135
01:30:54,546 --> 01:30:56,426
to run the news office and
one of her first things,

2136
01:30:56,866 --> 01:31:00,316
one of her first acts was to fix
a horrific omission for years

2137
01:31:00,316 --> 01:31:01,836
where harvard.edu did not exist.

2138
01:31:02,316 --> 01:31:06,226
www.harvard.edu existed and
they weren't even redirecting.

2139
01:31:06,276 --> 01:31:08,866
So, that is a bug
that's now been solved.

2140
01:31:09,486 --> 01:31:14,116
Any questions then on
what just happened there?

2141
01:31:14,116 --> 01:31:16,166
You all have the terminal
window open, let me offer

2142
01:31:16,166 --> 01:31:18,056
up some other troubleshooting
tips.

2143
01:31:18,186 --> 01:31:21,066
nslookup, name server
lookup, is a wonderful way

2144
01:31:21,236 --> 01:31:24,096
of doing those DNS
lookups we talked about before.

2145
01:31:24,336 --> 01:31:27,756
What I've just done is ask the
nearest DNS server which happens

2146
01:31:27,806 --> 01:31:30,056
to be this because that's how
Harvard has configured the

2147
01:31:30,056 --> 01:31:31,516
campus, that's the DNS server.

2148
01:31:31,896 --> 01:31:34,506
And I've asked for the
IP address of harvard.edu

2149
01:31:34,586 --> 01:31:35,996
and it's given me
this IP address.

2150
01:31:36,416 --> 01:31:39,616
So if I want to get
curious, let me do this,

2151
01:31:39,616 --> 01:31:43,006
http://ipaddress, interesting.

2152
01:31:43,986 --> 01:31:45,376
Why is this not working?

2153
01:31:45,376 --> 01:31:46,516
Well, again, VHosting.

2154
01:31:46,696 --> 01:31:48,356
Like the website
is not configured

2155
01:31:48,356 --> 01:31:50,626
to understand IP
addresses by default.

2156
01:31:50,806 --> 01:31:51,906
However, let's try another one.

2157
01:31:51,976 --> 01:31:57,876
nslookup cnn.com-- oops,
cnn.com, well interesting.

2158
01:31:58,006 --> 01:31:59,326
So it turns out with DNS,

2159
01:31:59,326 --> 01:32:01,086
you can also do what's
called round robin.

2160
01:32:01,086 --> 01:32:04,126
You can return multiple IP
addresses for a web server

2161
01:32:04,126 --> 01:32:06,166
and those can rotate
literally in the order

2162
01:32:06,166 --> 01:32:08,996
in which they're returned to do
load balancing and we'll discuss

2163
01:32:09,026 --> 01:32:11,386
that topic again toward the end
of the semester in scalability.
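A small sketch of the round-robin idea, assuming a name server that hands back the same list of addresses but rotated by one on each query. The class name `RoundRobin` is hypothetical, and the addresses are placeholders from a range reserved for documentation, not CNN's real ones.

```python
from collections import deque


class RoundRobin:
    """Return the same addresses on every query, rotated each time."""

    def __init__(self, addresses):
        self._addresses = deque(addresses)

    def answer(self):
        current = list(self._addresses)
        self._addresses.rotate(-1)  # move the first address to the end
        return current


# Placeholder addresses (documentation range), purely illustrative.
dns = RoundRobin(["192.0.2.1", "192.0.2.2", "192.0.2.3"])
first = dns.answer()
second = dns.answer()
print(first)
print(second)
```

Clients that simply take the first address in each answer end up spread across the servers.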

2164
01:32:11,386 --> 01:32:12,366
But let me choose one of these.

2165
01:32:12,366 --> 01:32:14,166
And CNN either pretty big.

2166
01:32:14,166 --> 01:32:16,536
I am guessing they don't
really share with some other

2167
01:32:16,536 --> 01:32:17,276
websites.

2168
01:32:17,466 --> 01:32:18,906
So let's just go to
their IP address

2169
01:32:19,346 --> 01:32:20,906
and indeed, there it works.

2170
01:32:21,336 --> 01:32:23,416
And now notice my
URL hasn't changed.

2171
01:32:23,646 --> 01:32:27,586
So now, if I really
want to get sort of--

2172
01:32:27,716 --> 01:32:33,216
if I really want to get sort of
creative, I'm going to do this.

2173
01:32:33,506 --> 01:32:36,176
On my Mac and you can do this
on a Windows machine as well.

2174
01:32:36,496 --> 01:32:39,156
There's typically a file on
Macs and Linux computers,

2175
01:32:39,156 --> 01:32:41,936
it's called /etc/hosts, which is
a text file that maps,

2176
01:32:42,226 --> 01:32:46,126
that hard codes IP
addresses for domain names.

2177
01:32:46,486 --> 01:32:49,376
This is useful generally
for internal corporate use

2178
01:32:49,376 --> 01:32:50,736
or development purposes.

2179
01:32:50,736 --> 01:32:52,986
So we'll be able to do
this with projects as well.

2180
01:32:53,376 --> 01:32:56,386
I'm going to go ahead
and authenticate here.

2181
01:32:57,026 --> 01:33:00,726
It's just a text file and
notice there are some basic ones

2182
01:33:00,726 --> 01:33:01,746
that come with the system.

2183
01:33:01,786 --> 01:33:04,996
This is an IPv6, version 6
address written in a crazy form.

2184
01:33:04,996 --> 01:33:08,416
I'm going to go ahead
and paste in not the URL

2185
01:33:08,676 --> 01:33:11,006
but the IP address
of CNN and I'm going

2186
01:33:11,006 --> 01:33:15,036
to say this is davidnews.com.

2187
01:33:15,396 --> 01:33:17,586
So this is like manually
overriding,

2188
01:33:17,766 --> 01:33:20,516
the mapping of that IP
address to something else here,

2189
01:33:20,516 --> 01:33:21,696
only from my own computer.

2190
01:33:21,776 --> 01:33:23,036
I'm not running a DNS server.

2191
01:33:23,206 --> 01:33:25,716
It's just that my
Operating System Mac OS

2192
01:33:25,716 --> 01:33:27,656
and Windows is supposed
to look at a file

2193
01:33:27,656 --> 01:33:29,856
like this before
asking a DNS server.
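The lookup order just described can be sketched as follows, assuming a hosts file in the usual "address name [name...]" layout. The functions `parse_hosts` and `resolve` are hypothetical, the DNS step is a stand-in callback, and 203.0.113.10 is a documentation placeholder rather than CNN's real address.

```python
SAMPLE_HOSTS = """\
# entries that come with the system
127.0.0.1    localhost
::1          localhost
# manual override added for the demo (placeholder address)
203.0.113.10 davidnews.com
"""


def parse_hosts(text):
    """Map each host name to the first address listed for it."""
    mapping = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments
        if not line:
            continue
        address, *names = line.split()
        for name in names:
            mapping.setdefault(name, address)
    return mapping


def resolve(name, hosts, dns_lookup):
    """Check the hosts table first, then fall back to DNS."""
    return hosts.get(name) or dns_lookup(name)


hosts = parse_hosts(SAMPLE_HOSTS)
# Stand-in for a real DNS query; always returns a placeholder.
print(resolve("davidnews.com", hosts, lambda name: "198.51.100.7"))
```

Because the table is consulted first, the made-up name resolves locally and no DNS server is ever asked about it.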

2194
01:33:30,346 --> 01:33:32,276
So now let's see if this works.

2195
01:33:32,276 --> 01:33:34,226
It doesn't work with all
websites but let me go

2196
01:33:34,226 --> 01:33:38,966
to http://davidnews.com.

2197
01:33:41,896 --> 01:33:46,006
Come on. Oh, come on.

2198
01:33:47,166 --> 01:33:47,706
There it is.

2199
01:33:48,476 --> 01:33:50,256
I've just made my own news site.

2200
01:33:51,096 --> 01:33:53,136
So, frankly, this is
kind of stupid of them.

2201
01:33:53,226 --> 01:33:55,586
Like, I was just joking with
some friends the other day

2202
01:33:55,586 --> 01:33:56,866
that you could kind
of have fun with this

2203
01:33:56,866 --> 01:33:58,546
and make fairly offensive
domain names

2204
01:33:58,546 --> 01:34:00,546
and they all lead
to CNN somehow.

2205
01:34:00,546 --> 01:34:01,306
And why is this?

2206
01:34:01,626 --> 01:34:04,226
So, this is a trivial defect,
frankly, in a web server.

2207
01:34:04,226 --> 01:34:07,216
A web server could be configured
as you will be able to do

2208
01:34:07,216 --> 01:34:11,156
with features of Apache before
long, of checking upon receipt

2209
01:34:11,156 --> 01:34:12,576
of one of those virtual
envelopes,

2210
01:34:12,576 --> 01:34:14,146
what was in two field?

2211
01:34:14,426 --> 01:34:16,586
If the To field does
not match something

2212
01:34:16,586 --> 01:34:18,766
that we're happy with,
redirect the user.

2213
01:34:18,766 --> 01:34:21,486
How? You respond with
what status code?

2214
01:34:22,656 --> 01:34:26,086
301. So it is actually trivial
to fix this kind of thing.

2215
01:34:26,086 --> 01:34:28,616
That could still lead-- They
can't stop davidnews.com

2216
01:34:28,616 --> 01:34:31,916
from leading to cnn.com but
they can stop the browser

2217
01:34:31,916 --> 01:34:35,256
from staying there or at least
by encouraging it with that 301

2218
01:34:35,256 --> 01:34:36,366
to redirect elsewhere.
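That check can be sketched in a few lines, assuming the server compares the Host header with one canonical name. `CANONICAL_HOST` and `check_host` are hypothetical names, and this is application-level Python rather than the Apache configuration the course will use.

```python
# Assumed canonical name for the site, chosen for illustration.
CANONICAL_HOST = "www.cnn.com"


def check_host(host, path="/"):
    """Return (status, headers): redirect unless Host is canonical."""
    if host != CANONICAL_HOST:
        # 301 tells the browser the resource has moved permanently;
        # Location says where to go instead.
        return 301, {"Location": f"http://{CANONICAL_HOST}{path}"}
    return 200, {}


print(check_host("davidnews.com"))
print(check_host("www.cnn.com"))
```

The request for the made-up name still reaches the server, but the browser is told to move on to the canonical address.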

2219
01:34:36,576 --> 01:34:39,766
And this redirection is super
common, not just for harvard.edu

2220
01:34:39,996 --> 01:34:41,336
but even the course's
own website.

2221
01:34:41,336 --> 01:34:46,636
If I go to http://cs75.net
and hit enter,

2222
01:34:46,946 --> 01:34:49,256
notice what the URL changes to.

2223
01:34:50,816 --> 01:34:52,706
A few things happened there.

2224
01:34:52,826 --> 01:34:54,026
So this is the course's website.

2225
01:34:54,026 --> 01:34:54,866
What are some of the things

2226
01:34:54,866 --> 01:34:58,156
that got inserted
automatically it seems?

2227
01:34:58,856 --> 01:34:58,936
Yeah.

2228
01:34:59,836 --> 01:35:01,766
>> [Inaudible] secure version.

2229
01:35:01,766 --> 01:35:02,516
>> Indeed.

2230
01:35:02,516 --> 01:35:05,186
So, I didn't just go
to the www version.

2231
01:35:05,186 --> 01:35:07,166
I also went to the
secure version, why?

2232
01:35:07,536 --> 01:35:09,506
We've just gotten into the habit
and I personally have gotten

2233
01:35:09,506 --> 01:35:11,046
into the habit of using this
as a result for everything.

2234
01:35:11,046 --> 01:35:12,416
It's relatively cheap to do.

2235
01:35:12,416 --> 01:35:15,556
It's relatively trivial to turn
on and it's only getting cheaper

2236
01:35:15,556 --> 01:35:16,896
as CPUs are getting faster.

2237
01:35:16,896 --> 01:35:19,146
And some of you might
be familiar with--

2238
01:35:19,146 --> 01:35:22,296
about a year and a half ago
a tool called Firesheep was

2239
01:35:22,296 --> 01:35:25,366
released which was a wonderfully
free proof of concept

2240
01:35:25,366 --> 01:35:27,656
of something called a
session hijacking attack.

2241
01:35:27,906 --> 01:35:29,476
Something we'll talk
about in a few weeks time

2242
01:35:29,476 --> 01:35:30,736
in the context of security.

2243
01:35:31,156 --> 01:35:33,326
Long story short, if you
are visiting a website

2244
01:35:33,546 --> 01:35:39,076
that uses http://, it is
fairly trivial for someone

2245
01:35:39,076 --> 01:35:41,586
in your nearby wireless
vicinity whether in this room,

2246
01:35:41,626 --> 01:35:45,906
at Starbucks, even in your home,
if you have adversarial siblings

2247
01:35:45,906 --> 01:35:49,696
or roommates, to log in
to your Facebook account

2248
01:35:49,696 --> 01:35:51,656
or your Google account
or-- actually not Google,

2249
01:35:51,756 --> 01:35:53,196
Facebook account
or Twitter account

2250
01:35:53,386 --> 01:35:55,276
or any website that's
not using https.

2251
01:35:55,276 --> 01:35:58,576
And that is because if
you're not using https,

2252
01:35:59,106 --> 01:36:00,956
nothing is encrypted and
you probably knew that.

2253
01:36:01,586 --> 01:36:03,766
But among the things that
aren't encrypted are things

2254
01:36:03,766 --> 01:36:04,436
called cookies.

2255
01:36:04,876 --> 01:36:08,246
So if you are just broadcasting
cookies and cookies it turns

2256
01:36:08,246 --> 01:36:10,806
out as we'll see in a week
or two are the mechanism via

2257
01:36:10,806 --> 01:36:14,096
which users are remembered as
being logged in to websites.

2258
01:36:14,446 --> 01:36:16,796
If you're just sending that
cookie to a website again

2259
01:36:16,796 --> 01:36:19,206
and again to remind you, I'm
logged in, I'm logged in,

2260
01:36:19,206 --> 01:36:20,996
I'm logged in, that's
not encrypted.

2261
01:36:20,996 --> 01:36:23,326
Anyone in Starbucks
can sniff that cookie

2262
01:36:23,576 --> 01:36:26,026
and with the right technical
savvy, as you will soon have,

2263
01:36:26,296 --> 01:36:28,126
send it as their own
and now they're logged

2264
01:36:28,126 --> 01:36:28,966
in to whatever you were.

2265
01:36:28,966 --> 01:36:30,506
It doesn't mean they
know your password

2266
01:36:30,936 --> 01:36:34,106
but it does mean they can
hijack your current session

2267
01:36:34,106 --> 01:36:34,746
so to speak.

2268
01:36:35,076 --> 01:36:38,466
So I retracted Google because
Google about a year or two ago,

2269
01:36:38,466 --> 01:36:40,446
thanks to some of the
issues in China they had

2270
01:36:40,446 --> 01:36:41,856
with the hacking and whatnot.

2271
01:36:42,116 --> 01:36:44,206
They transitioned all
their services to https.

2272
01:36:44,206 --> 01:36:45,956
At least if you opt
into it, Facebook

2273
01:36:45,956 --> 01:36:47,366
also finally offers this.

2274
01:36:47,696 --> 01:36:48,836
But again, we'll
come back to this.

2275
01:36:48,876 --> 01:36:51,116
So, a few more weeks of
insecurity in your lives

2276
01:36:51,116 --> 01:36:53,346
if you don't mind but we'll
come back to this and talk

2277
01:36:53,346 --> 01:36:55,366
about how you can have
certain defenses up.

2278
01:36:55,366 --> 01:36:59,626
And then tragically, even
websites that redirect

2279
01:37:00,076 --> 01:37:02,506
from the unencrypted version

2280
01:37:02,506 --> 01:37:05,726
to the encrypted version might
still be vulnerable because many

2281
01:37:05,726 --> 01:37:08,026
of those websites will
first do a redirect

2282
01:37:08,356 --> 01:37:11,786
that sends your cookie in the
clear and then it realizes, "Oh.

2283
01:37:11,786 --> 01:37:12,816
This should be secure."

2284
01:37:13,056 --> 01:37:14,036
But by then it's too late.

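[Editor's sketch] The lecture does not name a specific defense at this point, so the following is an assumption: one widely used mitigation is for the server to mark the session cookie with the Secure attribute, which tells browsers to send it only over HTTPS, so a first unencrypted request never carries it. The cookie name and value below are hypothetical.

```python
from http.cookies import SimpleCookie


def secure_session_cookie(name, value):
    """Build a Set-Cookie header line for a cookie restricted to HTTPS."""
    jar = SimpleCookie()
    jar[name] = value
    jar[name]["secure"] = True    # browser sends it only over HTTPS
    jar[name]["httponly"] = True  # hides it from page scripts
    jar[name]["path"] = "/"
    return jar.output()


# Hypothetical cookie, for illustration only.
header = secure_session_cookie("session_id", "abc123")
```

A site that sets its session cookie this way can still redirect from http to https; the difference is that the browser withholds the cookie until the connection is encrypted.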
2285
01:37:14,306 --> 01:37:16,506
So, even though banking
websites almost always use this,

2286
01:37:16,506 --> 01:37:18,876
there have been certain
banks known to be not

2287
01:37:18,876 --> 01:37:22,526
so technically savvy who
are still leaking cookies

2288
01:37:23,116 --> 01:37:25,286
for reasons that
we'll soon reveal.

2289
01:37:25,556 --> 01:37:29,136
So I inserted the www, the https,
and then the stupid main page,

2290
01:37:29,136 --> 01:37:31,186
which is a MediaWiki thing
which is the tool we used

2291
01:37:31,186 --> 01:37:32,056
for the course's website.

2292
01:37:32,056 --> 01:37:33,286
It's free wiki software

2293
01:37:33,526 --> 01:37:35,306
and that's not really
intellectually interesting,

2294
01:37:35,436 --> 01:37:37,216
just their own software thing.

2295
01:37:37,776 --> 01:37:38,246
All right.

2296
01:37:38,566 --> 01:37:40,246
Any questions on http?

2297
01:37:41,346 --> 01:37:42,876
We'll delete davidnews.com.

2298
01:37:43,146 --> 01:37:45,786
And again, that wouldn't
work with all web sites.

2299
01:37:45,786 --> 01:37:47,996
And in fact, Harvard's
probably would not work for us.

2300
01:37:49,196 --> 01:37:49,656
All right.

2301
01:37:49,976 --> 01:37:52,366
So we've teased apart
these http headers.

2302
01:37:52,366 --> 01:37:54,966
They'll become an invaluable
resource when it comes time

2303
01:37:54,966 --> 01:37:57,236
to chase down bugs or
features in your own code.

2304
01:37:57,386 --> 01:37:59,186
Let me take another
quick look at something

2305
01:37:59,186 --> 01:38:03,386
within Chrome's developer
toolbar, then we'll do one last

2306
01:38:03,576 --> 01:38:05,106
thing with regard
to Google and see

2307
01:38:05,106 --> 01:38:06,726
if we can implement
a little more

2308
01:38:06,726 --> 01:38:07,976
of our own version of Google.

2309
01:38:08,306 --> 01:38:12,926
So, let me go to the elements
tab and just as a proof

2310
01:38:12,926 --> 01:38:15,216
of concept here, Google's
website is a little complex

2311
01:38:15,216 --> 01:38:17,906
underneath the hood, but let
me go ahead and right click

2312
01:38:17,936 --> 01:38:21,296
on Harvard University and I'm
going to choose Inspect Element.

2313
01:38:21,936 --> 01:38:24,106
And Inspect Element is
nice because it's going

2314
01:38:24,106 --> 01:38:27,856
to jump me right to the part
of the html that relates

2315
01:38:27,856 --> 01:38:30,646
to that portion of the page
which is wonderfully useful

2316
01:38:30,646 --> 01:38:32,366
for diving in deeper
to a specific place.

2317
01:38:32,726 --> 01:38:36,656
So that is the A href that got
me that Harvard University link.

2318
01:38:36,936 --> 01:38:38,916
Now suppose, I'm
actually Google's designer

2319
01:38:38,916 --> 01:38:40,846
and we're not quite happy
with the shade of blue

2320
01:38:40,846 --> 01:38:43,226
or the font size or font
face and so I want to tinker

2321
01:38:43,226 --> 01:38:45,536
with the web site but frankly
I don't want to have to log

2322
01:38:45,536 --> 01:38:48,526
in to the server and change font
size then save the file then

2323
01:38:48,526 --> 01:38:50,966
reload the browser and go
through all these hoops.

2324
01:38:51,096 --> 01:38:55,246
I'd kind of like to do it
inline in the browser albeit

2325
01:38:55,246 --> 01:38:56,586
without saving any changes.

2326
01:38:56,906 --> 01:38:59,616
So notice here on the right, if
you've not used Chrome before,

2327
01:39:00,186 --> 01:39:01,986
that-- or the developer toolbar.

2328
01:39:02,226 --> 01:39:06,006
Notice on the right, you have
a summary of all the styles

2329
01:39:06,036 --> 01:39:09,306
that relate to that specific
element in the web page.

2330
01:39:09,586 --> 01:39:14,246
So from top to bottom,
here's the C in CSS,

2331
01:39:14,246 --> 01:39:15,566
cascading top to bottom.

2332
01:39:15,866 --> 01:39:18,186
Here are all of the rules
that apply to that element.

2333
01:39:18,266 --> 01:39:21,686
So, here there's apparently
somewhere in Google's CSS

2334
01:39:21,686 --> 01:39:26,036
and apparently the file called
search on line six, an a link,

2335
01:39:26,036 --> 01:39:28,546
a .w class mentioned
and all of this

2336
01:39:28,826 --> 01:39:31,096
where they're specifying the
color of the link and the cursor

2337
01:39:31,166 --> 01:39:32,386
that should be used over it.

2338
01:39:32,706 --> 01:39:33,926
So let me just try it for kicks.

2339
01:39:33,926 --> 01:39:36,976
Let me go in here and
let me change this

2340
01:39:36,976 --> 01:39:41,296
to let's say completely
random like orange.

2341
01:39:41,716 --> 01:39:42,876
So notice what I've
done up there.

2342
01:39:42,946 --> 01:39:44,506
I've changed
Google's link to orange

2343
01:39:44,506 --> 01:39:47,636
of course not permanently and
only for the links in the page

2344
01:39:47,636 --> 01:39:49,266
for which that CSS rule applies.

2345
01:39:49,576 --> 01:39:51,526
But the point here is that this
is just a wonderfully quick

2346
01:39:51,526 --> 01:39:53,926
and dirty way of experimenting
especially if you're quite eager

2347
01:39:54,076 --> 01:39:56,656
and you want to get like
the pixel alignment perfect

2348
01:39:56,656 --> 01:39:56,936
in something.

2349
01:39:57,146 --> 01:40:00,756
Being able to just tweak it ever
so slightly here and then figure

2350
01:40:00,756 --> 01:40:02,826
out what the values
are and then write them

2351
01:40:02,926 --> 01:40:04,696
in the actual file
on the server.

2352
01:40:04,696 --> 01:40:06,256
It's just wonderfully useful.

2353
01:40:06,896 --> 01:40:08,126
Also too, if you
are trying to figure

2354
01:40:08,126 --> 01:40:10,026
out what font a web site uses.

2355
01:40:10,346 --> 01:40:11,886
I can go to computed style

2356
01:40:11,886 --> 01:40:13,766
because this can be a little
overwhelming like, "My God.

2357
01:40:13,766 --> 01:40:15,886
There's so many rules
that apply to this element

2358
01:40:15,886 --> 01:40:17,446
because of the cascading
nature of CSS."

2359
01:40:17,446 --> 01:40:21,246
Let me just look at the computed
styles which is the summary

2360
01:40:21,246 --> 01:40:24,556
of the end result of all of the
styles that have been applied

2361
01:40:24,856 --> 01:40:30,606
and if I look for font family,
it indeed this is apply--

2362
01:40:30,666 --> 01:40:32,966
this is Arial followed
by Sans Serif.

2363
01:40:33,276 --> 01:40:35,326
So I know what font
now Google is using.

2364
01:40:35,326 --> 01:40:37,246
So just a debugging trick
if you've not used it,

2365
01:40:37,536 --> 01:40:39,286
this and the Network
tab we'll end

2366
01:40:39,286 --> 01:40:41,326
up using quite a
bit most likely.

2367
01:40:41,806 --> 01:40:43,716
So now back to our
own version of Google.

2368
01:40:43,906 --> 01:40:47,676
Unfortunately, if I type in
Harvard and hit Google search,

2369
01:40:48,016 --> 01:40:50,586
it doesn't go anywhere
sort of-- it kind of did.

2370
01:40:50,586 --> 01:40:51,446
Where did it end up?

2371
01:40:52,726 --> 01:40:53,996
The URL is almost the same.

2372
01:40:53,996 --> 01:40:56,576
It's a crazy looking URL
because it's obviously a file

2373
01:40:56,576 --> 01:40:58,056
on my hard drive
not on the internet.

2374
01:40:58,346 --> 01:40:59,756
But what did change in the URL?

2375
01:40:59,856 --> 01:40:59,946
Yeah?

2376
01:41:00,016 --> 01:41:01,096
[ Inaudible Remark ]

2377
01:41:01,096 --> 01:41:04,646
Good. So the question
mark got appended

2378
01:41:04,916 --> 01:41:08,636
but no parameters got
sent so let me go ahead

2379
01:41:08,636 --> 01:41:10,896
and take a look at
my file again.

2380
01:41:10,896 --> 01:41:12,596
Oh well, no parameters were sent

2381
01:41:12,596 --> 01:41:14,006
because I didn't give
any of them names.

2382
01:41:14,006 --> 01:41:15,086
So let me go back and fix this.

2383
01:41:15,086 --> 01:41:17,106
Let me shrink the font so
we can see more at once

2384
01:41:17,686 --> 01:41:22,496
and let me do input name equals
q here and let me go back

2385
01:41:22,496 --> 01:41:24,776
over here and reload the page.

2386
01:41:25,226 --> 01:41:27,396
And now let me go
ahead and type Harvard

2387
01:41:27,396 --> 01:41:28,796
and now click Google search.

2388
01:41:29,136 --> 01:41:30,326
So now we have some progress.

2389
01:41:30,806 --> 01:41:31,896
So this is interesting.

2390
01:41:32,096 --> 01:41:35,376
Now unfortunately, Google.html
is not a web site nor is

2391
01:41:35,376 --> 01:41:36,016
it dynamic.

2392
01:41:36,016 --> 01:41:37,396
It's literally just
a static file

2393
01:41:37,396 --> 01:41:39,476
so you can send any parameters
you want; it's just going

2394
01:41:39,476 --> 01:41:40,566
to ignore you every time.

2395
01:41:41,046 --> 01:41:42,986
But what if I kind
of cheat here?

2396
01:41:43,456 --> 01:41:45,976
I'm kind of in a hurry to
implement my own search engine.

2397
01:41:45,976 --> 01:41:48,626
I-- You know, I already
knocked off my own news site,

2398
01:41:48,976 --> 01:41:50,546
now I want to do my
own search engine.

2399
01:41:50,806 --> 01:41:52,316
Well I can actually do this,

2400
01:41:52,686 --> 01:41:55,696
form action should
actually go to,

2401
01:41:55,696 --> 01:42:00,416
let's say www.google.com/search
and I'm going

2402
01:42:00,416 --> 01:42:04,396
to say method equals get
in all lower case here,

2403
01:42:04,666 --> 01:42:07,266
slight inconsistency with
what we've seen before.

2404
01:42:07,486 --> 01:42:10,076
And now, let me go back
to this page, oops,

2405
01:42:11,306 --> 01:42:13,746
reload and now let
me type Harvard

2406
01:42:13,746 --> 01:42:16,756
and watch the URL
as I hit enter.

2407
01:42:16,756 --> 01:42:19,646
Ah, now I have implemented my
own version of Google but how?

2408
01:42:20,196 --> 01:42:23,476
Right. All I did was I
constructed a form, I specified a

2409
01:42:23,476 --> 01:42:25,826
method of get, an
action that happens

2410
01:42:25,826 --> 01:42:28,556
to be a point B elsewhere
but because of HTTP

2411
01:42:28,556 --> 01:42:30,506
and because the browser
knows how to handle forms,

2412
01:42:30,566 --> 01:42:34,006
it compiled all of the key value
pairs, in this case just one,

2413
01:42:34,006 --> 01:42:36,936
q equals something, put
it in the URL, and sent it

2414
01:42:37,156 --> 01:42:38,936
to that action attribute.

2415
01:42:39,056 --> 01:42:40,646
Sent it to that particular URL.

2416
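[Editor's sketch] What the browser did with the form can be approximated in a few lines of Python. This is a simplified model, not the browser's actual algorithm: the function name submit_get_form is made up for illustration, and it ignores details such as an action that already contains a query string. It collects the named fields into key-value pairs, URL-encodes them, and appends them to the action after a question mark. Inputs without a name are skipped, which is why the first attempt produced a bare question mark with no parameters.

```python
from urllib.parse import urlencode


def submit_get_form(action, fields):
    """Return the URL requested when a method="get" form is submitted.

    fields is a list of (name, value) pairs; inputs with no name
    (None or an empty string) are left out of the query string.
    """
    pairs = [(name, value) for name, value in fields if name]
    return f"{action}?{urlencode(pairs)}"


# The input named q, pointed at Google's search URL.
named = submit_get_form("http://www.google.com/search", [("q", "Harvard")])

# The earlier attempt: the input had no name, so nothing was sent.
unnamed = submit_get_form("http://www.google.com/search", [(None, "Harvard")])
```

Here named comes out as http://www.google.com/search?q=Harvard, while unnamed ends in a lone question mark, matching what the address bar showed in each case.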
01:42:40,866 --> 01:42:44,276
So now we have implemented
our own version of Google.

2417
01:42:44,386 --> 01:42:47,086
Now of course it would be nice
if it were our search results

2418
01:42:47,086 --> 01:42:49,376
and we weren't completely
cutting corners here but that's

2419
01:42:49,376 --> 01:42:52,786
where we'll need something like
PHP to do things server side.

2420
01:42:53,926 --> 01:42:55,326
Let me pause for just a moment.

2421
01:42:55,326 --> 01:42:56,706
Peter, do you want to
say hello to the class?

2422
01:42:56,706 --> 01:42:58,796
Peter is one of our four
teaching fellows; the others are

2423
01:42:58,796 --> 01:42:59,656
at work right now.

2424
01:43:00,016 --> 01:43:02,596
Do you want to say actual hello?

2425
01:43:02,596 --> 01:43:02,796
>> Yeah.

2426
01:43:02,796 --> 01:43:04,366
>> I have to come near you
with the microphone though.

2427
01:43:04,476 --> 01:43:05,706
Yeah. Why don't you
come this way

2428
01:43:05,706 --> 01:43:07,756
so the camera's a little
more readily available?

2429
01:43:07,756 --> 01:43:09,586
>> Hello all.

2430
01:43:09,586 --> 01:43:11,616
I look forward to
working with all of you

2431
01:43:11,616 --> 01:43:13,116
and I will see you on Wednesday.

2432
01:43:13,386 --> 01:43:14,136
>> OK. Excellent.

2433
01:43:14,216 --> 01:43:14,826
Thanks very much.

2434
01:43:15,166 --> 01:43:19,976
Any questions then about
Google, fake news, HTTP?

2435
01:43:20,646 --> 01:43:20,976
Yeah?

2436
01:43:21,516 --> 01:43:25,756
[ Inaudible Remark ]

2437
01:43:26,256 --> 01:43:26,716
Good question.

2438
01:43:26,876 --> 01:43:27,226
Let's see.

2439
01:43:27,806 --> 01:43:29,726
Had it not been named
q, would this be broken?

2440
01:43:29,726 --> 01:43:30,996
So let me go back into here

2441
01:43:31,266 --> 01:43:33,626
and let's just call it query
thinking that's a reasonable

2442
01:43:33,626 --> 01:43:34,806
name to give a parameter to.

2443
01:43:34,806 --> 01:43:38,386
Let me reload my HTML, let
me type in Harvard, enter.

2444
01:43:39,186 --> 01:43:41,106
And interesting,
they support query.

2445
01:43:41,176 --> 01:43:43,576
So let me misspell it.

2446
01:43:44,246 --> 01:43:46,656
OK. That's probably not
supported but maybe let's see.

2447
01:43:46,656 --> 01:43:50,576
OK. Let me reload,
type Harvard, enter.

2448
01:43:51,276 --> 01:43:52,286
OK. That just didn't work.

2449
01:43:52,546 --> 01:43:53,896
So they support query or q

2450
01:43:53,896 --> 01:43:55,766
for whatever backwards
compatibility reasons.

2451
01:43:56,716 --> 01:43:57,266
Good question.

2452
01:43:57,796 --> 01:43:58,416
Other questions?

2453
01:43:59,356 --> 01:43:59,446
Yeah.

2454
01:44:00,516 --> 01:44:06,546
[ Inaudible Remark ]

2455
01:44:07,046 --> 01:44:10,046
Yes. On the entirety

2456
01:44:10,046 --> 01:44:12,626
of Harvard's campus your
sessions can be hijacked.

2457
01:44:13,066 --> 01:44:17,016
So, if you have malicious
roommates or you are--

2458
01:44:17,226 --> 01:44:18,936
there's more technical
people around you,

2459
01:44:18,936 --> 01:44:19,886
you are vulnerable to this.

2460
01:44:19,926 --> 01:44:21,716
Session hijacking can
happen anywhere

2461
01:44:22,006 --> 01:44:23,746
where encryption is not used.

2462
01:44:23,746 --> 01:44:25,396
Whether between you
and the access point,

2463
01:44:25,756 --> 01:44:27,556
the wireless access
point or between you

2464
01:44:27,606 --> 01:44:30,596
and the end point web server and
I should say especially those

2465
01:44:30,596 --> 01:44:31,746
of you who are from out of town,

2466
01:44:31,746 --> 01:44:33,936
realize that Harvard does
have fine print and rules

2467
01:44:33,936 --> 01:44:36,436
about not doing this to
other people otherwise they--

2468
01:44:36,836 --> 01:44:39,196
you can solve this problem
by expelling people.

2469
01:44:39,196 --> 01:44:40,936
That's one way if you can't
do it technologically.

2470
01:44:41,266 --> 01:44:42,946
So, this is one of these
things where we're trying

2471
01:44:42,946 --> 01:44:45,386
to educate you as to how it can
be done but don't go trying this

2472
01:44:45,426 --> 01:44:46,416
in the dorms on campus.

2473
01:44:46,846 --> 01:44:48,366
Wait until you go home
on your own home network.

2474
01:44:49,676 --> 01:44:50,486
Other questions?

2475
01:44:50,976 --> 01:44:51,186
All right.

2476
01:44:52,076 --> 01:44:53,426
So where does this leave us?

2477
01:44:53,426 --> 01:44:55,696
So we started by talking about
Google and we keep talking

2478
01:44:55,696 --> 01:44:57,256
about Google but just
because it's so popular

2479
01:44:57,446 --> 01:44:59,806
but the story applies to
really, any web site out there.

2480
01:45:00,186 --> 01:45:04,826
So, we talked about DNS and
the process of not only looking

2481
01:45:04,826 --> 01:45:07,616
up a web page's URL
or rather IP address

2482
01:45:07,686 --> 01:45:09,556
but also getting
your own IP address

2483
01:45:09,556 --> 01:45:12,476
and getting your own web server
and your own domain name.

2484
01:45:12,736 --> 01:45:15,356
We talked a little
bit about HTML forms

2485
01:45:15,356 --> 01:45:17,606
which you might be familiar
already but some of the tools

2486
01:45:17,606 --> 01:45:19,456
with which you can get a
little more comfortable

2487
01:45:19,456 --> 01:45:22,096
with diagnosing things
and debugging things

2488
01:45:22,096 --> 01:45:24,306
and we've written some
HTML that submits a form.

2489
01:45:24,766 --> 01:45:25,886
All that we haven't done

2490
01:45:25,886 --> 01:45:27,956
yet is actually implement
a dynamic web site.

2491
01:45:27,956 --> 01:45:29,886
For that we've completely
outsourced to CNN

2492
01:45:29,886 --> 01:45:31,436
and Google today but one

2493
01:45:31,436 --> 01:45:33,726
of the first things we will do
this coming Wednesday is dive

2494
01:45:33,726 --> 01:45:36,776
in deeper to PHP and
things like get and post

2495
01:45:36,806 --> 01:45:38,496
and sessions and shopping carts.

2496
01:45:38,716 --> 01:45:40,746
Some of the security
implications around them,

2497
01:45:40,746 --> 01:45:42,526
we'll dive into the
CS50 appliance

2498
01:45:42,566 --> 01:45:43,986
and this virtual
machine environment

2499
01:45:43,986 --> 01:45:46,106
with which you'll
get to play in terms

2500
01:45:46,106 --> 01:45:48,336
of Apache and PHP and MySQL.

2501
01:45:48,906 --> 01:45:50,926
So tonight, we'll
adjourn a bit early.

2502
01:45:50,926 --> 01:45:52,166
I'll stick around with Peter

2503
01:45:52,166 --> 01:45:54,436
for any one-on-one questions
you might have but otherwise,

2504
01:45:54,436 --> 01:45:56,006
we will see you again
on Wednesday

2505
01:45:56,006 --> 01:45:58,906
and after Wednesday's lecture,
we'll flow right into section

2506
01:45:58,906 --> 01:46:00,556
and office hours if
you have questions

2507
01:46:00,556 --> 01:46:03,136
about content even though the
first project won't be released

2508
01:46:03,136 --> 01:46:03,876
for a week or so.

2509
01:46:04,066 --> 01:46:04,616
All right.

2510
01:46:04,616 --> 01:46:05,476
See you in a couple of days.

2511
01:46:06,516 --> 01:46:18,660
[ Silence ]

