Module 2 - Creating Data Types and Tables
Data Types
-system supplied data types
-alias data types (aka user defined data types)
-third “user defined types” (managed code addressed later)
System Supplied datatypes:
-All described here we use these every day

Notes:
-a global identifier id is generated with the “new id” function.
-Special datatype timestamp is not really a date/time is a row identifier and there for row identity purposes. Sql_variant can contain different sorts of data…
Alias Data Types
Based on system types
Used for data elements with a specific format
Created with the CREATE TYPE statement
Syntax:
CREATE Type dbo.zip
FROM char(5)
NOT NULL
Can also be created in SSMS
DBàProgrammabilityàTypesàUser-Defined Data Typesàrcànew type.
Tables
How does SQL Server organize rows in a table
Rows can only be 8060k

I’m still a little weary of this diagram, its on page 2-13
Here is what mark says:
SQL server starts each row with a 4 byte
**header
Contains housekeeping information like:
Number of fixed data length rows
a pointer to the null block (talk about this later)
And things like that
Following the header there are the data rows that hold
**fixed data
-in sql server fixed data rows will always be the same length no matter how much data they actually hold.
Following the fixed data section is whats called a
**null block
the null block has house keeping information about the fixed data section.
· how many fixed data columns are in this row?
· nullability section that has a bitmap that describes that for each one of those columns, is that column nullable. 1 if nullable, 0 if not
· number of columns in the fixed data section with pointers to those columns
Following the null Block (if there are any variable data length rows) is the
**Variable Block
· how many variable data columns are in this row?
· Pointer to the first byte of the each variable data column.
Large Data Value
Organization
Large datatypes are going away
-there is now a “max” identifier for our text/ntext/binary data
- with the text/ntext/binary data using max identifier the default is that
The above is true UNLESS you toggle the LARGE DATA TYPES OUT OF ROW ON/OFF setting in the sp_tableoption system sproc. The default is OFF.

Considerations
-Collation
-sort order of characters in a column
-characters that are available
-case sensitive or not
-Nullability
-default can be set by admin
-suggests you specify this so admin can not override
-Special column types
-Computed type a virtual column that is not physically stored in the table.
-Identity type with integer like data system generated sequential numbering goes here based on your specification.
-timestamp type is there for backward compatibility (was rowidentifier) it appears to create a unique identifier based on the time that is guaranteed to be unique within the database.
-uniqueidentifier populated with the new ID function (GUID) unlike identity
Creating the table:
SSMS à Right click Tablesà New Table
TSQL Syntax:
CREATE TABLE schema.tablename
(field
int,
Filed varchar)
Partitioning Tables
-A partition table takes data rows of the table and places them in different file groups on the basis of the partition scheme and function.
Partitioned horizontally.
Partition function specifies 2 things
-data type (i.e. data)
-ranges of values (boundries)
-the partition function defines the partition boundries, the boundary values can be assigned LEFT or RIGHT
Syntax:
CREATE PARTITION FUNCTION pf_DataOfOrder (datetime)
AS RANGE RIGHT
FOR VALUES (‘01/01/2003’, ‘01/01/2004’,
‘01/01/2005’)
Above Boundaries Description:
Partition 1 will have all records before 1/1/03
Partition 2 will have all records from 03-04
Partition 3 will have all records from 04-05
Partition 4 will have all current data???
A partition SCHEME is what is going to actually partition the table.
The SCHEME is what references the partition function.
Number of file groups needs to match the number of partitions defined in the partition function.
Syntax:
--
Create partition function
USE AdventureWorks
GO
CREATE PARTITION FUNCTION pf_OrderDate (datetime)
AS
RANGE RIGHT
FOR
VALUES ('01/01/2004', '01/01/2005')
/*
Partition
1 (in fg1) contains all 2003 data i.e. <1/1/2004
Partition
2 (in fg2) contains all 2004 data i.e. <1/1/2005
Partition
3 (in fg3) will fill up with new data
Partition
4 (in fg4) is marked as the next used filegroup in
partition scheme 'ps_OrderDate' ???
no idea what this means
*/
--
Add filegroups and create partition scheme
ALTER
DATABASE AdventureWorks ADD FILEGROUP fg1
ALTER
DATABASE AdventureWorks ADD FILEGROUP fg2
ALTER
DATABASE AdventureWorks ADD FILEGROUP fg3
ALTER
DATABASE AdventureWorks ADD FILEGROUP fg4
GO
ALTER
DATABASE AdventureWorks
ADD
FILE
( NAME =
data1,
FILENAME = 'c:\Program
Files\Microsoft SQL Server\MSSQL.1\MSSQL\Data\AWd1.ndf',
SIZE = 1MB,
MAXSIZE = 100MB,
FILEGROWTH = 1MB)
TO
FILEGROUP fg1
GO
ALTER
DATABASE AdventureWorks
ADD
FILE
( NAME =
data2,
FILENAME = 'c:\Program
Files\Microsoft SQL Server\MSSQL.1\MSSQL\Data\AWd2.ndf',
SIZE = 1MB,
MAXSIZE = 100MB,
FILEGROWTH = 1MB)
TO
FILEGROUP fg2
GO
ALTER
DATABASE AdventureWorks
ADD
FILE
( NAME =
data3,
FILENAME = 'c:\Program
Files\Microsoft SQL Server\MSSQL.1\MSSQL\Data\AWd3.ndf',
SIZE = 1MB,
MAXSIZE = 100MB,
FILEGROWTH = 1MB)
TO
FILEGROUP fg3
GO
ALTER
DATABASE AdventureWorks
ADD
FILE
( NAME =
data4,
FILENAME = 'c:\Program
Files\Microsoft SQL Server\MSSQL.1\MSSQL\Data\AWd4.ndf',
SIZE = 1MB,
MAXSIZE = 100MB,
FILEGROWTH = 1MB)
TO
FILEGROUP fg4
GO
CREATE PARTITION SCHEME ps_OrderDate
AS
PARTITION pf_OrderDate
TO
(fg1, fg2, fg3, fg4)
GO
--Partition
scheme 'ps_OrderDate' has been created successfully.
'fg4' is marked as the next used filegroup in partition scheme 'ps_OrderDate'.
--If a partition
scheme does not have a defined “NEXT” filegroup an
error will occur.
--
Create partitioned table
CREATE TABLE dbo.PartitionedTransactions
(
TransactionID
int IDENTITY(1,1) NOT NULL,
ProductID
int NOT NULL,
TransactionDate
datetime NOT NULL DEFAULT (getdate()),
TransactionType
nchar(1)
NOT NULL
)
ON ps_OrderDate(TransactionDate)
GO
--
Insert data
INSERT
INTO dbo.PartitionedTransactions
SELECT ProductID, TransactionDate, TransactionType
FROM Production.TransactionHistory
GO
INSERT
INTO dbo.PartitionedTransactions
VALUES
(1,
'01/01/2005', 'S')
GO
--
View partition metadata
SELECT
* FROM sys.Partitions
WHERE
[object_id] = OBJECT_ID('dbo.PartitionedTransactions')
/*
partition_id object_id index_id partition_number hobt_id
rows
--------------------
----------- ----------- ---------------- --------------------
--------------------
72057594054246400 871674153
0 1 72057594054246400 42844
72057594054311936 871674153
0 2 72057594054311936 70599
72057594054377472 871674153
0 3 72057594054377472 1
*/
--
View data with partition number
select $Partition.pf_OrderDate(TransactionDate)
as PartitionNo, max(TransactionDate),
min(TransactionDate)
FROM dbo.PartitionedTransactions
group by $Partition.pf_OrderDate(TransactionDate)
order by $Partition.pf_OrderDate(TransactionDate)
/*
PartitionNo
-----------
----------------------- -----------------------
1 2003-12-31 00:00:00.000 2003-09-01
00:00:00.000
2 2004-09-03 00:00:00.000 2004-01-01
00:00:00.000
3 2005-01-01 00:00:00.000 2005-01-01
00:00:00.000
*/
--
Verify lowest value in each partition
SELECT
MIN(TransactionDate) FirstTran, $Partition.pf_OrderDate(TransactionDate) PartitionNo
FROM dbo.PartitionedTransactions
GROUP
BY $Partition.pf_OrderDate(TransactionDate)
ORDER
BY PartitionNo
/*
FirstTran PartitionNo
-----------------------
-----------
2003-09-01
00:00:00.000 1
2004-01-01
00:00:00.000 2
2005-01-01
00:00:00.000 3
*/
/*
left this for reference (not used here)
--
Reset database
DROP
TABLE dbo.PartitionedTransactions
DROP
PARTITION SCHEME ps_OrderDate
DROP
PARTITION FUNCTION pf_OrderDate
ALTER
DATABASE AdventureWorks REMOVE FILE data1
ALTER
DATABASE AdventureWorks REMOVE FILE data2
ALTER
DATABASE AdventureWorks REMOVE FILE data3
ALTER
DATABASE AdventureWorks REMOVE FILE data4
ALTER
DATABASE AdventureWorks REMOVE FILEGROUP fg1
ALTER
DATABASE AdventureWorks REMOVE FILEGROUP fg2
ALTER
DATABASE AdventureWorks REMOVE FILEGROUP fg3
ALTER
DATABASE AdventureWorks REMOVE FILEGROUP fg4
8?
What can you do with
partitioned Data? (page 2-32)
SWITCH – Allows us to exchange a populated table or partition with an empty table or partition.
Note: The way I understand
it is this. You take the chunk of data that is in partition one and plop it
into an Archive table thus making your table smaller.
Like Prp to PrpArchive.
SELECT
top 0 *
into PartitionedTransactionsArchive
from PartitionedTransactions
--have
to move this to fg1 I manually did it but I dont know
the syntax to do it.
ALTER
TABLE dbo.PartitionedTransactions
SWITCH
PARTITION 1
TO PartitionedTransactionsArchive
select top
10 * from PartitionedTransactionsArchive
select max(TransactionDate),min(TransactionDate)
from PartitionedTransactionsArchive
/*
-----------------------
-----------------------
2003-12-31
00:00:00.000 2003-09-01 00:00:00.000
*/
select max(TransactionDate),min(TransactionDate)
from PartitionedTransactions
/*
-----------------------
-----------------------
2004-09-03
00:00:00.000 2004-01-01 00:00:00.000
*/
MERGE – Takes two partitions and squishes them into one partition.
Uses the ALTER PARTITION FUNCTION
When you perform a merge operation the partition for the boundary
value specified in the ALTER PARTITION FUNCTION statement is remoe, the data is merged into the adjacent partition.
Here is the create partition function from above:
CREATE PARTITION FUNCTION [pf_OrderDate](datetime)
AS RANGE RIGHT FOR VALUES (N'2004-01-01
00:00:00', N'2005-01-01 00:00:00')
/*
Partition
1 (in fg1) contains all 2003 data i.e. <1/1/2004
Partition
2 (in fg2) contains all 2004 data i.e. <1/1/2005
Partition
3 (in fg3) will fill up with new data
Partition
4 (in fg4) is marked as the next used filegroup in
partition scheme 'ps_OrderDate' ???
no idea what this means
--If a partition
scheme does not have a defined “NEXT” filegroup an
error will occur.
*/
Merge code:
ALTER
PARTITION FUNCTION pf_OrderDate()
This is what the function now looks like this (SSMSàDBàStorageàPartition Functionsàpf_orderdate)
CREATE PARTITION FUNCTION [pf_OrderDate](datetime)
AS RANGE RIGHT FOR VALUES (N'2005-01-01
00:00:00')
/*
Partition
1 (in fg1) contains all 2003 data i.e. <1/1/2004
Partition
2 (in fg2) contains all 2004 data i.e. <1/1/2005
Partition
3 (in fg3) will fill up with new data
Partition
4 (in fg4) is marked as the next used filegroup in
partition scheme 'ps_OrderDate' ???
no idea what this means
--If a partition
scheme does not have a defined “NEXT” filegroup an
error will occur.
*/
Here is all of the code needed to duplicate the above:
/*
MERGE functionality*/
/*********************************************************/
--a
create script of our current [pf_OrderDate] function
USE [AdventureWorks]
GO
/******
Object: PartitionFunction
[pf_OrderDate]
Script
Date: 03/04/2008 16:05:08 ******/
CREATE
PARTITION FUNCTION [pf_OrderDate](datetime)
AS
RANGE RIGHT FOR VALUES (N'2004-01-01 00:00:00', N'2005-01-01 00:00:00')
/*********************************************************/
--running
the merge operation:
ALTER
PARTITION FUNCTION pf_OrderDate()
/*********************************************************/
--a
create script of our new [pf_OrderDate] function
USE [AdventureWorks]
GO
/******
Object: PartitionFunction
[pf_OrderDate]
Script
Date: 03/04/2008 16:07:09 ******/
CREATE
PARTITION FUNCTION [pf_OrderDate](datetime)
AS
RANGE RIGHT FOR VALUES (N'2005-01-01 00:00:00')
/**********************************************************/
--
View partition metadata
SELECT
* FROM sys.Partitions
WHERE
[object_id] = OBJECT_ID('dbo.PartitionedTransactions')
/*
BEFORE:
partition_id object_id index_id partition_number hobt_id
rows
-------------------- -----------
----------- ---------------- -------------------- --------------------
72057594054246400 871674153
0 1 72057594054246400 42844
72057594054311936 871674153
0 2 72057594054311936 70599
72057594054377472 871674153
0 3 72057594054377472 1
*/
/*
AFTER:
72057594054246400 871674153 0 1 72057594054246400 113443
72057594054377472 871674153 0 2 72057594054377472 1
*/
/****************************************************************************
Partition
1 and 2 were merged and the partitions were renumbered.
*****************************************************************************/
--
View data with partition number
select $Partition.pf_OrderDate(TransactionDate)
as PartitionNo, max(TransactionDate),
min(TransactionDate)
FROM dbo.PartitionedTransactions
group by $Partition.pf_OrderDate(TransactionDate)
order by $Partition.pf_OrderDate(TransactionDate)
/* BEFORE
PartitionNo
----------- -----------------------
-----------------------
1 2003-12-31 00:00:00.000 2003-09-01 00:00:00.000
2 2004-09-03 00:00:00.000 2004-01-01 00:00:00.000
3 2005-01-01 00:00:00.000 2005-01-01 00:00:00.000
*/
/*AFTER
PartitionNo
----------- -----------------------
-----------------------
1 2004-09-03 00:00:00.000 2003-09-01 00:00:00.000
2 2005-01-01 00:00:00.000 2005-01-01 00:00:00.000
*/
SPLIT – The opposite of merge this will split one partition into two.
Now I want to split partition 1 back into 2003 data and 2004 data
--old
func
USE [AdventureWorks]
GO
/******
Object: PartitionFunction
[pf_OrderDate]
Script Date: 03/04/2008 16:23:14 ******/
CREATE
PARTITION FUNCTION [pf_OrderDate](datetime) AS RANGE RIGHT FOR VALUES (N'2005-01-01
00:00:00')
/*begin
split operation*/
ALTER PARTITION FUNCTION pf_OrderDate()
--new
func
USE [AdventureWorks]
GO
/******
Object: PartitionFunction
[pf_OrderDate]
Script Date: 03/04/2008 16:23:32 ******/
CREATE
PARTITION FUNCTION [pf_OrderDate](datetime) AS RANGE RIGHT FOR VALUES (N'2004-01-01 00:00:00', N'2005-01-01 00:00:00')
/*BEFORE
PartitionNo
----------- -----------------------
-----------------------
1 2004-09-03 00:00:00.000 2003-09-01 00:00:00.000
2 2005-01-01 00:00:00.000 2005-01-01 00:00:00.000
*/
/*AFTER
PartitionNo
----------- -----------------------
-----------------------
1 2003-12-31 00:00:00.000 2003-09-01 00:00:00.000
2 2004-09-03 00:00:00.000 2004-01-01 00:00:00.000
3 2005-01-01 00:00:00.000
2005-01-01 00:00:00.000
*/
YAY!!! Weee
End of Module 2!!
QUIZ Info:
The smallest allocable unit of storage is an extent, eight contiguous pages.
Collation can be set for a server, database, and a column
XML data has methods
A real datatype is an approximate numeric datatype