
Understanding Data Warehouses

ACKNOWLEDGEMENTS

First of all, I would like to express my sincere and deep gratitude to my supervisor, Ms. Nguyen Thi Xuan Huong, M.Sc., who guided me wholeheartedly and gave me every support during my graduation project. I sincerely thank the lecturers of the Faculty of Information Technology, Hai Phong Private University, who passed on valuable knowledge and helped me throughout four years of study and during the graduation project. I respectfully thank Mr. Tran Huu Nghi, Rector of Hai Phong Private University, for his support and encouragement and for providing the best possible conditions for us during our time at the university. Finally, I sincerely thank my family and friends, who encouraged and helped me and contributed many valuable comments during my studies and the graduation project.

Hai Phong, July 2010
Student

Nguyen Thi Mai Huong

Trang -1127.0.0.1 downloaded 60.NGUYENTHIMAIHUONG_CT1001.pdf at Mon Jul 16 22:39:59 ICT 2012


CONTENTS

ACKNOWLEDGEMENTS
INTRODUCTION
Chapter 1. INTRODUCTION TO THE DATA WAREHOUSE
1.1. History of the data warehouse
1.2. What is the data warehouse?
1.3. Characteristics
1.4. Purpose of the data warehouse
1.5. Goals of the data warehouse
1.5.1. Easy access
1.5.2. Consistent information
1.5.3. Adaptability to change
1.5.4. Decision support
1.5.5. Security
1.6. Main functions
1.7. Benefits
1.8. Properties of the data warehouse
1.9. Data structure for the data warehouse
1.10. Architecture of a data warehouse system
1.11. Relationship between the data warehouse and data mining
1.12. Application areas
Chapter 2. BASIC ELEMENTS OF THE DATA WAREHOUSE
2.1. Types of data and their use
2.1.1. Types of data
2.1.1.1. Meaning
2.1.1.2. Structure
2.1.1.3. Scope
2.1.2. Business data
2.1.2.1. Definition
2.1.2.2. Criteria for the types of business data
2.1.2.3. Three types of business data
2.1.3. Metadata
2.1.3.1. Concept
2.1.3.2. Purpose
2.1.3.3. Information that metadata must contain
2.1.3.4. Uses of metadata
2.1.3.5. Criteria for the types of metadata
2.1.3.6. Three types of metadata
2.1.4. Data beyond the scope of the data warehouse
2.1.4.1. Data as a product
2.1.4.2. Personal business data and metadata
2.1.5. Internal and external data
2.1.6. Conclusion
2.2. Conceptual data architecture
2.2.1. Business data architectures
2.2.2. The single-layer data architecture
2.2.3. The two-layer data architecture
2.2.4. The three-layer data architecture
Chapter 3. INTRODUCTION TO THE LOGICAL ARCHITECTURE OF THE DATA WAREHOUSE
3.1. Business data in the data warehouse
3.1.1. Operational systems
3.1.2. The business data warehouse
3.1.3. Business information warehouses (BIW)
3.2. Business data: other considerations
3.2.1. Special data needs
3.2.2. The rationale for unidirectional data flow
3.2.3. Supporting "reverse" data flows
3.2.4. Personal data
3.3. External data
3.3.1. External management information
3.3.2. Electronic data interchange (EDI)
3.4. Metadata in the data warehouse
3.5. The data warehouse catalog (DWC)
3.6. Operational systems
3.7. Data warehouse functionality
Chapter 4. LANGUAGE FOR THE DATA WAREHOUSE
4.1. Concept
4.2. The nature of OLAP
4.3. Statements that OLAP focuses on
4.4. Main objects of OLAP
4.4.1. Cube
4.4.2. Dimension
4.4.3. Measures
4.4.4. Partitions
4.4.5. An example of organizing a data warehouse in an education system
CONCLUSION
REFERENCES


INTRODUCTION

Once a business is up and running, its managers need answers about business performance: the growth rate; daily, monthly, quarterly, and yearly transaction volumes; comparisons between one year and another; customer segmentation; and revenue analysis.

Every business builds for itself a transaction management system (OLTP, Online Transaction Processing), that is, the applications, software, and operational systems it uses day to day. Examples include banks and telecommunications companies (which have to commission purpose-built systems). However, these systems are designed only for daily data entry and for running operations. They can also supply data for a few simple reports, but reports that cut across several dimensions, such as customer type and time, require complex calculations that such systems can hardly perform.

Moreover, large businesses such as banks and telecommunications companies run many subsystems in parallel. For example, a bank has subsystems for deposits (personal accounts, savings books), loans, and treasury; a telecommunications company has prepaid, postpaid, and sales subsystems. To produce a report with an overall view, they must therefore consolidate data from many different subsystems.

Because of these problems, they are forced to build one more system: a new database dedicated to enterprise-wide querying and reporting. This is the data warehouse, the place where data from all the subsystems is consolidated, where calculations are performed on that data, and from which tables are produced whose data has been computed for a particular purpose.

The data warehouse is a new technology direction now widely used for large problems such as business administration, healthcare, insurance, banking, population management, and telecommunications. Building a data warehouse not only helps a business store a large volume of information every day, it also helps managers extract that resource quickly and accurately. It further helps them analyze data and produce reports in a timely manner, which contributes to the best possible business results. This knowledge is also very useful and necessary for exploiting the achievements of information technology ever more effectively. That is why I chose this topic for my graduation project.

The thesis consists of four chapters: Chapter 1: Introduction to the data warehouse; Chapter 2: Basic elements of the data warehouse; Chapter 3: Introduction to the logical architecture of the data warehouse; Chapter 4: Introduction to the language for the data warehouse, which introduces OLAP and presents an example of building a data warehouse. The final part is the conclusion.


Chapter 1.

INTRODUCTION TO THE DATA WAREHOUSE

1.1. History of the data warehouse

The concept of the data warehouse arose from the combination of two sets of needs:
- The business requirement for a company-wide view of information.
- The need of information systems departments to manage company data in the best possible way.

In the 1990s, "data warehouse" became a buzzword in the computer industry.

Figure 1: Data warehouse evolution

The data revolution of the early 1990s: Most warehouses deployed in this period were initiated by information systems (IS) organizations. It became clear that earlier approaches were not strong enough to supply data that would support future growth, and that users' ability to exploit the data was weakened by the lack of business context. The success of these implementations convinced IS managers, who then sold the concept to the business. The new approach depended on the business community recognizing the need for, and the value of, a broader view of business data than had previously been possible. In particular, a common theme was the use of data for marketing and for strengthening competitive advantage.

At the beginning of this period, many industries saw significant changes in their business environment: an international recession cut profits, governments lifted tight controls on industries, competition in commodity markets increased, and governments replaced centrally planned markets with multi-sector market economies. This shows that business requirements drove the data revolution. Businesses needed a new view of how the company operated, one that spanned the earlier divisional orientation of the business.

The shift toward business-driven warehouse implementation prompted a reassessment of the benefits a warehouse can provide. While warehouse implementation was driven by IS, the warehouse was assumed to be justified by cost savings and improved efficiency. This assumption came from the traditional IS approach to cost justification, rooted in the application-driven model.

The era of information-based management in the 21st century: Theoretical analysis of data warehouse implementation has developed strongly since the 2000s. Nevertheless, the business know-how defined earlier, supported by technical guidelines, can still be regarded as important guidance today. Today we use source data to predict the future. The key to this is recognizing that the need for competitive advantage is the fundamental driver moving decision support from data toward information, and extending its audience beyond the boundaries of traditional management.

Figure 2: From data to information

This direction can be characterized by the term information-based management (abbreviated IBM). It is a change in the way decision support is delivered to the end-user community, and it can be summarized in the following five themes:

1. A single source of information: The raw data of interest comes from many different sources, both inside and outside the company, and exists in many forms, from traditional structured data to unstructured data such as documents or multimedia. Whatever its type and source, before the data reaches the end-user environment it must be cleaned and reconciled to ensure its quality and integrity. The reconciled information is the single, definitive source of information for information-based management.


2. Distribution of available information: Information-based management is not solely a central function; it is also valued across organizational divisions and geographical locations. These operations may need, and often demand, independence, but the information warehouses are logically connected for ease of change, performance, and increased reliability.

3. Information in a business context: Users understand and process information best when it is placed in the context of the business activities they take part in. Data definitions supplied by business experts become the standard, and an information catalog that contains these definitions and is aimed at end users becomes the source of data definitions for the enterprise's information systems.

4. Automated information delivery: As data is turned into information and moves along increasingly complex paths within and between organizations, automated delivery mechanisms become necessary. Automation is needed not only in the actual transfer but also in defining the required data transformations and movement. In distributed information environments in particular, the availability of these automated distribution mechanisms must be guaranteed.

5. Information quality and ownership: Information is an important asset of any company and, like any other asset, must be managed and protected. Its quality must be assured. Documenting and tracking the ownership of information is a prerequisite for recognizing the value of this asset.

Today's development environment

1. Fragmented application development

All new tools and technologies find their way into businesses. However, new tools are expensive and have to be introduced in stages, so a new approach is typically implemented through a series of pilot projects. The same applies to data processing. These factors, together with the limits of what people can manage, lead to fragmented implementation of data processing across all business activities. Each business unit, location, or organization has its own operational applications for the part of the business it is responsible for. This fragmentation can be seen in examples such as:
- Different order-entry applications are used for different product lines within the same company.
- A logically continuous process from order entry through to payment is split across several independent applications according to organizational responsibility.

This fragmentation brings some benefits. With independent applications focused on a delimited area of business function, projects can deliver defined application function to an identified group of end users with well-defined requirements.

2. Operational application development

The operational environment is driven by the needs of the business to supply goods or services. It is therefore defined mainly by the activities required rather than by the data used. User needs are described in terms of short-term activities. Analysis can focus on what is needed to take an order, schedule a delivery, and so on. Information systems can concentrate on the required inputs and outputs and on the surrounding activities. Individual activities can lead to independent applications, each optimized for the needs of its own activity. The user requirement here can be summed up as "automate these procedures". The success of automation is judged by simple measures: increased throughput or reduced cost of doing business, and ease of use or response time at the user level.


This model has been used successfully for data processing. Most business computing is directed at operational systems, and information systems take an application-oriented view. An application is simply a set of functions for a related group of users, developed in an integrated way. However, the way information systems integrate functions also determines how the scope of data within applications develops.

3. Application-driven decision support

Since information applications came into wide use on computer systems, large volumes of data have been stored and processed on computers. The problem for information applications today is no longer just storing and operating on data but also organizing data sources so that information can be extracted and decisions supported. This is a necessary evolution for information systems.

1.2. What is the data warehouse?

A data warehouse, more accurately called an information warehouse, is a subject-oriented database designed for access from every area of an organization, especially the business side. It provides tools to meet the information needs of business managers at every level of the organization: not only complex data requests, but also the most convenient means of obtaining information quickly and accurately. A data warehouse is designed so that users can recognize the information they want and access it with simple tools.

A data warehouse is a blend of many technologies, including multidimensional and relational databases, client/server architecture, graphical user interfaces, and more. The main reason for developing a data warehouse is to integrate data from many different sources into a single, dense repository that supports analysis and decision making in business and management.

For businesses that regard information as a very valuable resource, a data warehouse is much like a goods warehouse. Operational systems produce data parts and load them into the warehouse.

Some parts are summarized into information components and stored in the warehouse. Warehouse users make requests and are supplied with products built from the components and parts kept in the warehouse. The data warehouse is one of the hottest technology directions. A well-targeted, efficiently run data warehouse can become a highly valuable competitive tool in business.

1.3. Characteristics

First of all, a data warehouse is a very large database (VLDB). It is usually read-only, serves reporting needs, and is oriented toward stability. A data warehouse can take information from many different sources (DB2, Oracle, SQL Server, even ordinary files), clean it, and load it into its own structure, which is the VLDB.

Because a data warehouse is so large, each specialized group of end users can exploit the information easily only if the warehouse itself is specialized and divided into subjects. Each specialized subject forms a dedicated database called a data mart.

One point to note is that there is a tool, or more precisely a tool standard, supported by database management systems for querying the information in data marts and then drawing conclusions and making decisions from it: OLAP (Online Analytical Processing).

1.4. Purpose of the data warehouse

The main purposes of the data warehouse are:
- To help the organization's staff do their work well and efficiently, for example by making sound, fast decisions, selling more goods, raising productivity, and earning higher profits.
- To help the organization identify, manage, and run its projects and business operations efficiently and accurately.
- To integrate data and metadata from many different sources.

1.5. Goals of the data warehouse

A data warehouse (DW) must meet the following goals:

1.5.1. Easy access
Information stored in the DW must be intuitive and easy for users to understand. Data should be presented under names that are familiar and close to the users' business. Access to the data warehouse must be fast: because a large number of records has to be processed at once, this is one of the essential requirements of a DW.

1.5.2. Consistent information
Data in a DW usually comes from many different sources, so before it is loaded it must be cleaned and its quality assured. Cleaning also makes it easier to unify the data. One rule for this process is: if data items carry the same name, they must refer to the same thing; if data items refer to different entities, they must be given different names.

1.5.3. Adaptability to change
A DW must be designed to handle the changes that may occur, because change is unavoidable for any application. That is, when a change happens, the data already in the DW must remain correct.

1.5.4. Decision support
This is the most important goal for a business building a DW. Managers want to rely on the information in it to devise strategies that lead to the best business results.

1.5.5. Security
Data in a DW comes from many different sources, so ensuring that information does not leak outside is extremely important.
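The naming rule in 1.5.2 can be checked mechanically before data is loaded. The following is only a minimal sketch: the function `find_homonyms`, the source names, the column names, and the meanings are all hypothetical and do not come from the thesis. It flags any column name that is used with more than one meaning across source systems.

```python
from collections import defaultdict

def find_homonyms(columns):
    """Return names that refer to more than one thing across sources.

    columns: iterable of (source_system, column_name, business_meaning).
    All values used below are made-up illustrations.
    """
    meanings_by_name = defaultdict(set)
    for _source, name, meaning in columns:
        meanings_by_name[name].add(meaning)
    # A name with several meanings violates the rule
    # "same name must refer to the same thing".
    return {name: sorted(meanings)
            for name, meanings in meanings_by_name.items()
            if len(meanings) > 1}

# Hypothetical column inventory from two bank subsystems.
SOURCE_COLUMNS = [
    ("deposits", "customer_id", "customer identifier"),
    ("loans", "customer_id", "customer identifier"),
    ("deposits", "balance", "deposit balance"),
    ("loans", "balance", "outstanding loan amount"),
]

conflicts = find_homonyms(SOURCE_COLUMNS)
print(conflicts)
```

Here `balance` is reported because it means different things in the two sources, so under the rule one of the two columns would have to be renamed before loading.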


1.6. Main functions:

1. Data integration subsystem
2. Data analysis subsystem
3. System monitoring subsystem
4. System backup and recovery subsystem
5. Data security subsystem

1.7. Benefits:

* For data users:
o Provides fast, complete, and accurate tools for analyzing and exploiting data, making it easy to formulate new policies.
o Helps users exploit data by subject, across different sources and time periods.
o Data is processed quickly.
o Makes it easy to create simple reports suited to users of many skill levels.

* For system administrators:
o Supports building a large data warehouse.
o A flexible design makes it easy to integrate new operational data and to create new reports on user request.

1.8. Properties of the data warehouse

A data warehouse (DW) is a collection of data with the following properties:
* Integration: data is gathered from many different sources, so the gathering process has to clean, sort, and condense the data.
* Time-stamped, historical data: data from the company's business processes may go back many years.
* Stable data (nonvolatility): once a transaction is complete, the data cannot be added to or modified.
* Data does not fluctuate.
* Summarized data.

Lightly summarized data is the hallmark of a data warehouse. The elements of a business (departments, areas of activity, functions, and so on) all have different information requirements, so the warehouse design must provide customized, lightly summarized data for each business element (see also the discussion of data marts). Each business element may have access to both detailed and summarized data, but never to more than the total amount of data held in current detail.

Highly summarized data is fundamental to running the business. It can come either from the lightly summarized data used by the business elements or from current detail. The volume of data at this level is smaller than at the other levels; it is a selected collection that serves a wide variety of needs and interests. Besides access to highly summarized data, those running the business also need to be able to reach increasing levels of detail through a drill-down process.

1.9. Data structure for the data warehouse

Because the data in a data warehouse is very large and is not subject to operations such as modification or creation, it is optimized for analysis and reporting. Operations on warehouse data are based on the multidimensional data model, which is modeled as an object called a data cube. The data cube is the center of the problem being analyzed. It consists of one or more sets of facts, and the facts are built from several different dimensions.

Example: a sales statistic based on three factors: location, time, and product type. The data cube is the problem "sales statistics", with the three factors (location, time, and product type) as its three dimensions. The fact table is the table that summarizes how sales relate to the three factors. [...] in SQL).
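The sales example above can be sketched in a few lines. This is only an illustration under stated assumptions: the fact rows, the dimension names, and the helper `roll_up` are invented here, and a real warehouse would use an OLAP engine rather than in-memory tuples. The sketch shows a fact table with three dimensions and one measure, summed at different levels of detail.

```python
from collections import defaultdict

# Hypothetical fact rows: (location, quarter, product_type, sales_amount).
FACT_SALES = [
    ("Hai Phong", "2010-Q1", "phone", 120.0),
    ("Hai Phong", "2010-Q1", "laptop", 300.0),
    ("Hai Phong", "2010-Q2", "phone", 150.0),
    ("Ha Noi", "2010-Q1", "phone", 200.0),
    ("Ha Noi", "2010-Q2", "laptop", 450.0),
]
DIMENSIONS = ("location", "quarter", "product_type")

def roll_up(facts, keep):
    """Sum the sales measure, grouping only by the dimensions named in keep."""
    positions = [DIMENSIONS.index(name) for name in keep]
    totals = defaultdict(float)
    for row in facts:
        key = tuple(row[i] for i in positions)
        totals[key] += row[3]  # the measure is the last field of each row
    return dict(totals)

# Highly summarized view: one figure per location.
by_location = roll_up(FACT_SALES, ["location"])
# Drill down: add the time dimension to see more detail.
by_location_quarter = roll_up(FACT_SALES, ["location", "quarter"])
# Grouping by no dimension at all gives the grand total.
grand_total = roll_up(FACT_SALES, [])
```

Moving from `by_location` to `by_location_quarter` is the drill-down described in section 1.8: the same facts, viewed at a finer level of detail.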


1.10. Architecture of a data warehouse system

The data warehouse architecture describes the components, tools, and services of the warehouse, as well as their relationships and evolution. The purpose of standardizing the architecture is to integrate lower-level information systems so that they serve higher-level information systems, and vice versa. The architecture provides a mechanism for organizing data, improves information sharing between departments, and in the long run makes it possible to reuse data and to develop subsequent data warehouse projects faster.

Figure 3: Three-tier structure of the data warehouse

The architecture consists of three tiers:
- Bottom tier: provides the services that take data from many different sources, then standardize it, clean it, and store it centrally.
- Middle tier: provides the services that operate on the warehouse, known as the OLAP server. It can be implemented as Relational OLAP, Multidimensional OLAP, or Hybrid OLAP, which combines the two.
- Top tier: holds the queries, reports, and analyses.
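The division of work between the three tiers can be sketched as three small functions. This is a toy illustration, not the architecture of any particular product: the function names, the source rows, and the cleaning rules are assumptions made for the example.

```python
from collections import defaultdict

def bottom_tier_load(sources):
    """Bottom tier: gather rows from several sources, standardize and clean them."""
    warehouse = []
    for rows in sources:
        for region, amount in rows:
            if amount is None:                    # drop rows with no usable measure
                continue
            region = region.strip().title()       # standardize the spelling of names
            warehouse.append((region, float(amount)))
    return warehouse

def middle_tier_total_by_region(warehouse):
    """Middle tier: an OLAP-style aggregation service over the stored data."""
    totals = defaultdict(float)
    for region, amount in warehouse:
        totals[region] += amount
    return dict(totals)

def top_tier_report(totals):
    """Top tier: the query and reporting front end."""
    return [f"{region}: {total:.2f}" for region, total in sorted(totals.items())]

# Hypothetical rows from two source systems with inconsistent formatting.
SOURCE_A = [(" hai phong ", 100), ("Ha Noi", "250.5")]
SOURCE_B = [("HAI PHONG", 50), ("Ha Noi", None)]

warehouse = bottom_tier_load([SOURCE_A, SOURCE_B])
report = top_tier_report(middle_tier_total_by_region(warehouse))
print(report)
```

Keeping cleaning in the bottom tier means the middle and top tiers can assume consistent data, which is the main point of the layering.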


1.11. Relationship between the data warehouse and data mining

The two can stand independently of each other, but combining a data warehouse with data mining brings great benefits, for reasons such as:
- Warehouse data is well suited to data mining because it has already been gathered and cleaned.
- The warehouse infrastructure supports tasks such as export and import, as well as basic operations on the data, very well.
- OLAP provides very effective sets of commands for data analysis.

1.12. Application areas

A data warehouse can be applied in three main directions that call for business intelligence:
- Information processing, such as producing reports and answering predefined questions.
- Analysis and summarization of data, with results presented as reports and charts.
- Planning-oriented projects, such as data mining.

Figure 4: Business Intelligence applications

Areas where data warehouses are currently applied include:
- E-commerce.
- Enterprise Resource Planning (ERP).
- Customer Relationship Management (CRM).
- Healthcare.
- Telecommunications.

Chapter 2. BASIC ELEMENTS OF THE DATA WAREHOUSE

2.1. Types of data and their use

2.1.1. Types of data

2.1.1.1. Meaning
Computer-based data has long been used to run and manage businesses. This data is called business data, and it describes the state of the business. A second type is data whose importance lies in its content itself rather than in what it represents. This type is called data as a product, because it is produced, bought, and sold like any physical product; examples are films or books stored in digital form. The last type is metadata, which describes the meaning of data. Metadata exists only to define or describe business data or data as a product.

2.1.1.2. Structure
Data can be highly structured, with complete definitions of its fields and records; it can be unstructured, when its internal structure is highly variable; or it can lie somewhere between the two.

2.1.1.3. Scope

Figure 5: Types of data and the scope of the warehouse



Data can be personal, when its owner can change it at will, or public, when its use is shared among a number of users and any required change must be managed carefully.

2.1.2. Business data

2.1.2.1. Definition
Business data is the data used in running and managing a business or organization. It represents the activities the business undertakes and the real-world objects it deals with, such as customers, locations, and products. Business data is created and used by transaction processing systems and decision support systems (DSS).

2.1.2.2. Criteria for the types of business data
Four criteria are used to identify the types of business data: how the data is used in the business, the scope of the data, whether the data is read/write or read-only, and the time value of the data.

Use in the business
Data is used in the business for two purposes:
- Operational data is used to run the business and relates to day-to-day activities and decisions.
- Informational data is used to manage the business.

Scope of the data
Data can represent a single item or transaction, or it can summarize a set of items or transactions.
- Detailed data, or atomic data, is aimed at running the business, but is also used in some simple management tasks. It usually focuses on basic objects or basic transactions, such as individual products, orders, and customers.


- Summary data is used in management and gives an overview of how the business is operating.

Read/write or read-only data
- Read/write data requires careful design of the update process, which must ensure that the rules protecting the business's data are enforced.
- Read-only data is usually designed to be written once and then read many times.

Time value of the data
- Current data is a view of the business at the present moment. It is up to the second and is subject to change over time as business activities take place. It is an accurate representation of how the business is performing right now.
- Point-in-time data is a stable snapshot of business data at a particular moment and reflects the state of the business at that moment. Daily and monthly data sets are examples. This data may describe the past, or it may be a forecast that expresses plans or predicted future events.
- Periodic data is an important class of data. It provides a definitive record of the business as it changes over time. The periods involved vary widely, but periods spanning a number of years are the ones of interest in a DW.

2.1.2.3. Three types of business data

Real-time data: current, up-to-the-second data that represents the present state of the business and is used to run it. It exists at the detailed level and is accessed in read/write mode. Real-time data is created, manipulated, and used by operational or production applications. Traditionally this data resides in files or databases in a mainframe environment and is controlled and managed by the information systems department.
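The three time-related kinds of data from section 2.1.2.2 (current, point-in-time, and periodic) can be told apart in a short sketch. The class `AccountStore`, its methods, and the account values are hypothetical names invented for this illustration.

```python
class AccountStore:
    """Toy store that keeps current, point-in-time, and periodic views of balances."""

    def __init__(self):
        self.current = {}   # current data: one value per account, overwritten in place
        self.history = []   # periodic data: every change is kept as (date, account, balance)

    def update(self, date, account, balance):
        self.current[account] = balance
        self.history.append((date, account, balance))

    def snapshot(self):
        """Point-in-time data: a frozen copy of the current state."""
        return dict(self.current)

store = AccountStore()
store.update("2010-06-01", "A-1", 100)
store.update("2010-06-15", "A-2", 40)
end_of_june = store.snapshot()          # stable view as of the end of June
store.update("2010-07-03", "A-1", 70)   # later activity changes only the current data

balances_a1 = [balance for _date, account, balance in store.history if account == "A-1"]
```

After the July update, `store.current` shows the new balance, `end_of_june` still shows the old one, and `store.history` records both, which is the kind of multi-period record a warehouse is interested in.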

Real-time data is not restricted to mainframes or legacy applications. The newer client/server application model creates real-time data on workstations and servers. This real-time data is distributed throughout the business and is seldom under the direct control of the information systems department. Real-time data may also originate outside the business. This happens when business information, such as orders or invoices, is exchanged between organizations through electronic data interchange (EDI) and the incoming data is used directly in the operations of the receiving company.

Data                 Industry            Usage                                             Technology                                                Volumes
Customer file        All                 Track customer details                            Legacy application, flat files, mainframe                 Small to medium
Account balance      Finance             Control account activities, e.g., withdrawals     Legacy application, hierarchical database, mainframe      Large
Point of sale data   Retail              Generate bills, manage stock                      Client/server, relational database, UNIX system           Very large
Call record          Telecommunications  Billing                                           Legacy application, hierarchical database, mainframe      Very large
Production record    Manufacturing       Control production                                New application, relational database, AS/400              Medium

Figure 6: Examples of real-time data

Derived data: Derived data is simply data created, through some processing, from real-time data. It is used to manage the business, in read-only mode, rather than to run its day-to-day operations. It may be at a detailed or a summary level. Because it is derived from real-time data, it is either point-in-time in nature, giving a view of the business at a particular moment, or periodic in nature, preserving a historical record of the business over a period of time.
Derived data is the set of data traditionally used for decision support. It is found throughout today's organizations, from relational databases on mainframes to specialized spreadsheet packages on personal computers, and in many places in between. Although derived data is in principle populated automatically, in some cases the process is manual, with the contents of printed reports rekeyed into information management tools.

Reconciled data: Reconciled data is produced by a process designed to ensure the internal consistency of the resulting data. This process operates on real-time data at a detailed level. A secondary purpose of the process is to maintain or create a historical set of the data. Reconciled data can be regarded as a special kind of derived data.
In traditional decision support environments, reconciled data is seldom explicitly defined. In many cases it does not exist. Even where it exists, it is seldom physically stored and is merely the logical result of operations performed during derivation. In other cases it exists only in temporary files. As such, it is not recognized as having any business significance.
In practice, reconciled data is a key element of the data warehouse. Because applications have been developed one at a time, real-time data is rarely consistent across the whole scope of the enterprise. This makes data reconciliation necessary.
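The two forms of derived data described above can be illustrated with a minimal sketch. The account movements, field names, and helper functions are hypothetical and are not taken from the source; the sketch only shows one way in which a point-in-time snapshot and a periodic history might be derived, read-only, from detailed real-time data.

```python
from collections import defaultdict
from datetime import date

# Hypothetical real-time data: account movements as an operational
# application might record them (detailed, read/write).
movements = [
    {"account": "A1", "day": date(2010, 7, 1), "amount": 100.0},
    {"account": "A1", "day": date(2010, 7, 2), "amount": -40.0},
    {"account": "B7", "day": date(2010, 7, 2), "amount": 250.0},
    {"account": "A1", "day": date(2010, 8, 5), "amount": 15.0},
]


def point_in_time_balances(rows, as_of):
    """Derived, point-in-time data: balance per account as of one day."""
    balances = defaultdict(float)
    for row in rows:
        if row["day"] <= as_of:
            balances[row["account"]] += row["amount"]
    return dict(balances)


def periodic_totals(rows):
    """Derived, periodic data: net movement per account per month."""
    totals = defaultdict(float)
    for row in rows:
        key = (row["account"], row["day"].year, row["day"].month)
        totals[key] += row["amount"]
    return dict(totals)


snapshot = point_in_time_balances(movements, date(2010, 7, 31))
history = periodic_totals(movements)
print(snapshot)
print(history)
```

Neither function modifies the movement records, which mirrors the read-only use of derived data; the snapshot answers "what was the state on a given day", while the monthly totals keep a record across periods.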

Therefore, whenever data from multiple sources is combined, developers must first analyze the structure and content of the sources to determine the rules for combining them. They must then develop a process that enforces these rules. Typically, this process includes functions such as joining and manipulating fields, converting data fields to consistent formats, and, as a last resort, various kinds of error correction.

2.1.3. Metadata
One of the most important parts of a data warehouse is its store of data about data (metadata): the data used to manage data.

2.1.3.1. Concept
Metadata spans every level of the data warehouse, including the forms in which data exists and the functions that distinguish one warehouse from another. In other words, metadata is data that describes data. In a database, metadata is the various representations of the objects in the database. In a relational database, metadata is the definitions of tables, columns, views, and other objects. In a data warehouse, metadata is the definition of data items such as tables, columns, reports, business rules, or transformation rules. Metadata covers every aspect of the data warehouse.

2.1.3.2. Purpose
Data warehouse developers use metadata to administer and control the creation and maintenance of the data warehouse; this metadata resides outside the warehouse itself. Metadata for warehouse users is part of the warehouse itself and can be used to guide analysis of, and access to, the warehouse. For warehouse users, metadata is like a card catalog of the subjects held in the warehouse.
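For the relational case mentioned in 2.1.3.1, the following sketch shows metadata being read like a catalog. It uses Python's standard sqlite3 module as a stand-in for a relational database; the customer table and its columns are hypothetical and serve only to have something whose definitions can be inspected.

```python
import sqlite3

# In-memory database with one hypothetical table.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customer ("
    "customer_id INTEGER PRIMARY KEY, "
    "name TEXT NOT NULL, "
    "city TEXT)"
)

# sqlite_master is SQLite's catalog: one row per table, index, or view.
tables = [row[0] for row in conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table'")]

# PRAGMA table_info returns one row per column:
# (cid, name, type, notnull, dflt_value, pk).
columns = [(row[1], row[2]) for row in conn.execute(
    "PRAGMA table_info(customer)")]

print(tables)
print(columns)
conn.close()
```

The table and column definitions returned here are metadata in the narrow relational sense; warehouse metadata as described above would add items such as reports, business rules, and transformation rules.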


2.1.3.3. Metadata must contain the following information:
- The structure of the data
- The algorithms used to summarize the data
- The mapping that relates data in the operational environment to data in the warehouse

2.1.3.4. The role of metadata
Metadata is data that describes data. When data is delivered to end users, metadata therefore supplies the information that lets them better understand the nature of the data they have. This information helps users decide how to use that data correctly and appropriately. The structure and content of metadata can differ depending on the purpose and on the type of data. It includes several kinds of information:
- Information describing the metadata itself
- Information about the data that the metadata describes
- Information about the individuals and organizations associated with the metadata and the data

2.1.3.5. Criteria for classifying metadata
Like business data, metadata is classified according to a few basic criteria. There are two: when it is used in the application life cycle, and whether it is used actively or passively.
a) Relationship to the application life cycle: The use of metadata while defining and building business applications and their associated databases differs from its use once those applications and databases are in production. A distinction is drawn between:
- Build-time metadata: designed to facilitate the use, and reuse, of data and function by application and database designers.


- Production-time metadata: designed to make it easy to find, understand, and use the data needed in the business.
b) Active or passive usage: This criterion describes how production-time metadata is used:
- Metadata used to control the actions or functions of an application or other piece of software plays an active role.
- Metadata used in lookup mode, usually by a person, to find some business data or to understand some characteristic of business data is being used passively.

2.1.3.6. Three types of metadata
a) Build-time metadata: The metadata used in the warehouse originates in the process by which business applications and data are described and defined. The metadata created and used at this stage is build-time metadata. By the definition of the warehouse's scope, build-time metadata lies outside the warehouse. However, as with real-time business data, build-time metadata cannot be ignored, because it is the source of the metadata that does fall within the scope of the warehouse.
Today, build-time metadata is created and stored in data modeling and application design tools such as CASE tools. For existing applications, build-time metadata often exists only in the application's database or file designs, or in design or user documentation.
Build-time metadata is stable compared with the business data it describes. In general, it changes only when the overall structure of the business, or its implementation in applications, changes. Metadata defined during the design of an application stays unchanged for the life of a release of that application and persists until the release is upgraded.

b) Control metadata: Control metadata is used actively by warehouse components as a mechanism for managing and controlling their own operation. It is therefore a part of production-time metadata. It has two sources:
- Detailed physical structure information derived from build-time metadata. Because it is designed for use by warehouse components, this metadata is not suited to end users.
- The warehouse components themselves. Such metadata describes the activities actually being performed on the data it refers to. It is important to both end users and administrators of the warehouse. There are two kinds:
Currency metadata: Currency metadata describes the actual currency, or timeliness, of business data. Examples are the time a table in a database was last updated, or the time a particular application first ran on a given day. This information can be supplied only by the tool or application that populates the business data or runs the application.
Utilization metadata: Utilization metadata relates to security and to the authorization function used to control access to the warehouse. In addition, it provides the means to track which data or functions are used in the warehouse, and thus to assess their usefulness or value to end users.
c) Usage metadata: Usage metadata is the most important metadata for users of business data, especially in the informational environment. This is where end users gain business benefit and information systems staff gain improvements in productivity. Usage metadata is derived from build-time metadata and is similar in content. The difference is that metadata at this level must be structured so that users can search it and exploit it effectively. The structure end users require differs significantly from the one application and database designers need.
Usage metadata describes the following aspects of data or applications:
- Business meaning: This type of metadata describes the activities of the business in a formal or structured way. It lets users relate data elements or application functions to their purpose in the business. When the meaning of data and applications is known, users can relate them to the real business, and information systems staff and users can communicate on common ground.
- Ownership and stewardship: Ownership ties data or applications to the organization and identifies who is responsible for particular aspects of them and for maintaining them. Ownership can be divided; for example, one person may be responsible for the accuracy of a data file while another is responsible for its timeliness. Ownership of data may rest with those who make business decisions. In that case, the supporting role of data steward is defined to carry day-to-day responsibility for the data. In the warehouse environment, ownership of data is more important than ownership of application function, but data ownership is, by contrast, harder to assign. Once it is defined and recorded, end users can see who is responsible for the quality of the data.
- Data structure: Structural metadata describes the technical organization of the data. Several different kinds of structure need to be recorded. For example, a data element can be described in terms of where it is physically stored, which data structure is used, whether it is character or numeric, how large it is, and which application manages it.
- Application aspects


The metadata must describe what the application does, the language in which it is written, the data it uses and produces, and any prerequisites or, where relevant, requirements for using it. In this context, end users may use the applications directly, or they may be responsible for populating the data in the warehouse.

2.1.4. Data beyond the scope of the data warehouse

2.1.4.1. Data as a product
Some classes of information that are collected, manipulated, or produced in electronic form are growing rapidly in importance and value, yet they fall outside the scope of the data warehouse as defined here and indeed outside the scope of traditional data processing systems.
Data as a product is created and stored, but it is not a means of running or managing a business. It is a product of the activity of the business that can be bought and sold, and it must be managed and controlled like any physical product. For example, the value of a book lies in the information it contains. As a product, it is produced on paper. However, for most of its production process it exists as text and image data held in a computer.
Data as a product lies outside the scope of the data warehouse as defined. However, the tools and techniques used to build and manage a data warehouse can be used in a similar way to build and manage data as a product.

2.1.4.2. Personal business data and metadata
Personal data is defined simply as data under the control of a single individual. It is created, used, and deleted as required by the business process for which that person is responsible. Such data has always existed: a salesperson's handwritten notes on an order, an executive's list of the names, addresses, and birthdays of customer contacts, or a handwritten annual sales forecast kept beside tomorrow's to-do list. As the use of computers has grown, much of this data has come to be stored in spreadsheets, personal information managers, and similar tools.
Before 1990, personal data was of limited importance to information systems. It existed within information systems shops, but its volume was quite limited and it was relatively isolated from the mainstream of business data. Both factors have since changed considerably. End users now store hundreds of gigabytes of data on personal computers. Improvements in LAN, client/server, and Internet technology have greatly increased the exchange of data between computers and between companies in the information systems environment. Personal data is now linked into the network and can be shared easily.

2.1.5. Internal and external data
In the past, most of the data useful to an organization originated within that organization. Even when data came from outside, the number of sources was small and the volume of data so low that the influence of external data on the overall architecture was relatively minor. This is no longer the case. It has been reported, for example, that there are now more than 10,000 consumer online data sources in the United States, covering 1,500 variables and a reported 150 billion people. The extraordinary growth of the Internet in recent years has also caused exponential growth in the volume of electronic data flowing into and out of all organizations.
Within the defined scope of the data warehouse, interactions between internal and external data must be considered for each class of data:
- Structured business data: So that it can readily be combined with existing internal data, external structured data must be processed manually. The data must go through a reconciliation process with internal data to ensure that it is consistent with existing internal data. This implies that the related external metadata must also be made available for acquisition.


For structured business data leaving the organization, the related metadata must likewise be made available. In this case, legal liability may arise from supplying inaccurate data.
- Unstructured business data: The same considerations apply to unstructured business data. However, it is more difficult to incorporate unstructured data automatically into the decision-making process.
- Data as a product: External data as a product enters the data warehouse as business data.
- Metadata: Metadata seldom leaves or enters the organization on its own. Instead, it accompanies business data across the boundary of the organization. This is necessary so that the business data can be understood and reconciled as required.

Figure 7: Relationships between internal and external data

2.1.6. Conclusion
The scope of the data warehouse is difficult to pin down. This is especially true given the popularity of the subject and the efforts of vendors to profit by continually widening the scope to cover as many of their product lines as possible. This section has presented and defined the scope of the data warehouse and the types of data it supports. Data is divided, on the basis of its use, into business data and metadata, which are included in the warehouse, and data treated as a product, which lies outside it.

2.2. Conceptual data architecture
One of the first steps in designing any data processing system is to establish an overall architecture for the system and to gain broad acceptance of it. The design of a data warehouse is no different. Traditionally, the design of operational systems starts with the application architecture. This follows from the close link between operational applications and the functions that users require. The approach is supported by the relatively narrow scope of the data that such applications address. However, because data coherence is so important in a data warehouse, both business data and metadata must be the starting point for the warehouse architecture.
Three data architectures for business data are considered here. Each has its own advantages and disadvantages. Important criteria for evaluating them include the flexibility with which end users can access and use the data, the management of data quality by information systems staff, and other factors that arise in specific situations. Although no single architecture is the best fit for every situation, one particular approach is likely to succeed in the majority of cases. For metadata the position is simpler: a single data architecture supports all three alternatives for the business data architecture.

2.2.1. Business data architectures
The three architectural models described in the following sections have one thing in common: all are based on practical experience. The three are named after the number of data layers they comprise. These layers are conceptual rather than physical. In any implementation, therefore, a layer is identified by the types of data it contains, not by its physical location.

2.2.2. The single-layer data architecture
The key principle of the single-layer architecture is that any data element is stored once and only once. Although this goal may be difficult or impossible to achieve in practice, the structure of the architecture is what would allow it to be achieved. In a single-layer architecture, no distinction is made between the types of data described earlier; all data is treated alike. Although this is not a strictly accurate description, the architecture in essence treats all data as if it were real-time data. Derived data can exist within the architecture, but it is not treated any differently from the real-time data from which it originates.

Figure 8: The single-layer data architecture

The strength of the single-layer architecture comes from its goal of storing each data element only once. This minimizes data storage requirements and avoids the problem of keeping duplicated data synchronized. The weakness of the approach is the contention that arises between operational and informational applications, which leads either to data being unavailable to applications or to slow response times for operational applications. A further weakness is that it offers no help with how distributed data can be implemented or how users in different geographical locations can access the company's data.

2.2.3. The two-layer data architecture
This architecture improves on the single-layer architecture by recognizing two different uses of data, operational and informational, and by dividing the data into two layers (see the figure). The lower layer, used by operational applications in read/write mode, is real-time data. The upper layer, used by informational applications, is derived data. Derived data may be as simple as a direct copy of the real-time data, or it may be derived from the real-time data by some computation. This approach immediately resolves one of the main problems of the single-layer architecture: the contention between the two types of data use when both operate on a single source of data. A second benefit is that it explicitly addresses end users' need for data that differs from the data stored as real-time data.

Figure 9: The two-layer data architecture


However, one problem with this architecture is the high level of data duplication in the derived data layer. This duplication leads to an explosion in data storage and, more importantly, to problems of data management and administration.

2.2.4. The three-layer data architecture
The three-layer architecture takes the transformation of real-time data into derived data one step further than the two-layer architecture, splitting it into two steps:
1. Reconciling the data from the diverse data sets in the real-time layer.
2. Deriving the data that users need from the reconciled data.
This leads to the architecture shown in the figure.

Figure 10: The three-layer data architecture

In this approach, the lowest layer is real-time data, the top layer is derived data, and the middle layer is reconciled data. Reconciling the different sets of real-time data requires an understanding of how the data sets relate to one another and of their role in the business. In practice, this understanding is established through data modeling. The relationship between the reconciled data layer and the enterprise data model is the key to understanding how the three-layer architecture works. The concept can be understood by considering how one might reconcile the data from any two existing applications and what the result would be.
As an example of reconciliation, suppose that an order management application maintains a database consisting of a customer file, a product file, an order table, and an invoice table. An invoicing application maintains an invoicing database containing a customer table and an invoice table. When data from both systems is needed for management information, these portions of data must be combined and reconciled. The customer file from the order system and the customer table from the invoicing system must be combined into a single customer table in the warehouse. A more general customer entity must therefore be defined that meets the needs of both business areas.

Figure 11: An example of reconciliation
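The customer reconciliation in this example can be sketched as follows. This is only an illustration under stated assumptions: the record layouts, field names, and matching rules (customer numbers compared as integers, names trimmed and title-cased, the order system's name taking precedence) are hypothetical and are not specified in the source.

```python
# Hypothetical customer records from the two operational systems.
order_customers = [
    {"cust_no": "0042", "cust_name": "ACME LTD ", "phone": "555-0100"},
    {"cust_no": "0077", "cust_name": "Binh Minh Co", "phone": None},
]
invoice_customers = [
    {"id": 42, "name": "Acme Ltd", "billing_city": "Hai Phong"},
    {"id": 90, "name": "Delta Trading", "billing_city": "Ha Noi"},
]


def normalise_key(raw):
    """Bring both key formats ('0042' and 42) to one integer form."""
    return int(raw)


def reconcile(order_rows, invoice_rows):
    """Merge both sources into one customer entity keyed by customer_id."""
    customers = {}
    for row in order_rows:
        key = normalise_key(row["cust_no"])
        customers[key] = {
            "customer_id": key,
            "name": row["cust_name"].strip().title(),
            "phone": row["phone"],
            "billing_city": None,
            "sources": ["orders"],
        }
    for row in invoice_rows:
        key = normalise_key(row["id"])
        # Assumed rule: keep the order system's name if one already exists.
        merged = customers.setdefault(key, {
            "customer_id": key,
            "name": row["name"].strip().title(),
            "phone": None,
            "billing_city": None,
            "sources": [],
        })
        merged["billing_city"] = row["billing_city"]
        merged["sources"].append("invoicing")
    return customers


warehouse_customers = reconcile(order_customers, invoice_customers)
for customer in warehouse_customers.values():
    print(customer)
```

The merged record carries attributes from both systems, which is the sense in which the warehouse customer entity is more general than either source. In a real warehouse the matching rules would come from the enterprise data model rather than from ad hoc code.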


Furthermore, in the management information environment, data from these application areas must be related to other data in ways that differ from what was originally envisaged in the operational applications. For example, it may be necessary to analyze how invoices relate to the original customer orders in order to find what percentage of orders is fulfilled in a single shipment.
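The analysis mentioned in this example can be sketched on reconciled data as follows. The order and invoice identifiers are hypothetical, and the sketch assumes that each invoice corresponds to exactly one shipment, which the source does not state.

```python
from collections import Counter

# Hypothetical reconciled data: each invoice refers to the order it bills.
orders = ["O1", "O2", "O3", "O4"]
invoices = [
    {"invoice": "I1", "order": "O1"},
    {"invoice": "I2", "order": "O2"},
    {"invoice": "I3", "order": "O2"},
    {"invoice": "I4", "order": "O3"},
]

# Count invoices per order; orders with no invoice yet count as zero.
invoices_per_order = Counter(row["order"] for row in invoices)

# Assumption: one invoice corresponds to one shipment.
single = [o for o in orders if invoices_per_order[o] == 1]
share = 100.0 * len(single) / len(orders)
print(f"{share:.1f}% of orders were invoiced exactly once")
```

The calculation is only possible once orders and invoices share a common key, which is what the reconciliation step provides.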

Figure 12: Reconciliation and derivation in the three layers


Chapter 3. INTRODUCTION TO THE LOGICAL ARCHITECTURE OF THE DATA WAREHOUSE


3.1. Business data in the data warehouse

3.1.1. Operational systems
Operational systems are the applications used to run the business, and the data they use, held in files and databases, is real-time data. Today such applications exist in many formats and locations; they are heterogeneous and to some degree distributed. Newly built applications are typically implemented in client/server environments. Operational systems are often equated with legacy systems, but the two differ in one important respect. Legacy systems often include reporting functions that are used to manage the business. These are only a small part of the legacy application and should be distinguished from its operational functions. Their proper place is the derived data layer.
Because operational systems interact with one another, passing data along and modifying it as needed, it is always necessary to identify, as precisely and as early as possible, the exact origin of any given data item in the warehouse. Data modeling, and in particular the analysis of existing data in the context of the enterprise data model, plays an important role here.

3.1.2. The business data warehouse
The business data warehouse (BDW) is the physical implementation of the reconciled data layer. The characteristics of the reconciled data layer are:
- Detailed
- Historical
- Consistent
- Modeled
- Normalized

A BDW is implemented in a relational environment, which is the environment best suited to its modeled and normalized nature. In theory, a BDW could be distributed. However, reconciliation requires large volumes of data to be joined and compared, a process better suited to a non-distributed implementation. Organizational considerations also push the BDW toward a centralized implementation, because the BDW is intended to be a control point where the quality and integrity of the data are assured before the data is made widely available to end users. Security of the BDW is an important consideration, because it contains all of the integrated data of the enterprise. Physical security likewise argues for a centralized approach to storing this corporate asset.
Given the large size of the BDW, itself a consequence of its historical nature, only some of its contents may be physically online at any one time. The BDW is seldom used directly by end users; rather, it is the source of all data in the business information warehouses. Performance considerations for the BDW therefore center on large offline or batch processes: its population from the operational systems and the extraction of data from it for use.

3.1.3. Business information warehouses (BIWs)
A business information warehouse is the general name for any system used to report on, analyze, or forecast the business. The term covers management information reporting, decision support, and executive information systems, as well as marketing analysis systems, data mining applications, and so on. This environment is highly distributed, as seen in client/server and workstation-based implementations. While it is likely to remain highly distributed, it is less heterogeneous than the real-time data layer. Most BIWs exist in a relational structure based on rows and columns. This relational environment ranges from traditional databases to spreadsheets and multidimensional analysis tools.

BIWs contain derived data, defined to support the requirements of the business and of end users. They may contain detailed data or highly summarized data, and historical data covering long periods or only the short term. The structure of BIWs is suited to online query processing, including queries that are unforeseen or not predefined. There are two kinds of BIW: staging BIWs, which act as the source for other BIWs, and user BIWs, which do not. Staging BIWs require special management to ensure the stability and integrity of the data stored in them.

3.2. Business data: other considerations

3.2.1. Special data needs
- Corrections: When end users discover factual errors in a business information warehouse, they often correct their own data and want these corrections fed back to the source data to ensure a consistent view of the business. Corrections are needed in the operational systems, the business data warehouse, and the business information warehouses.
- Adjustments: Similar in effect to corrections, adjustments reflect a change in how data is classified in the business as circumstances change. The data was correct originally, but users later need to use or analyze it differently. This leads to a need to change data in the business data warehouse and may sometimes affect the operational systems as well.
- Data reuse: Data that was originally derived may become input to an operational process. For example, when analyzing customer buying patterns, end users (such as sales managers) may need to group customers into basic classes. These new categories are created as part of the derivation process and stored in the business information warehouses. Such categories may then be used as the basis for a new assignment system for the sales force. This is an operational process that requires data from the business information warehouses.
- Predictive data: Data used to forecast trends and to set future operating conditions originates in a business information warehouse and is used to set data in the operational systems. For example, an analysis of raw material prices in the derived data layer allows new selling prices to be calculated, which can then be input to the operational systems.

3.2.2. The rationale for unidirectional data flow
The rationale for a unidirectional data flow rests on the basic definitions of the data types and derives from the principles of database management. It is widely accepted that data must be created and maintained in a carefully controlled and managed environment, where it can be verified and validated on entry through a consistent, well-defined set of input checking procedures. Operational systems must meet this condition.

3.2.3. Supporting "reverse" data flows
The solution to each of these needs rests on recognizing that, in every case, new data is being created, even though this new data is closely tied to existing data. The principle is that new data is created and maintained in the real-time data layer by the operational systems, and a key responsibility of the operational systems is to verify and validate the data they receive from any source.

3.2.4. Personal data
Personal data falls largely outside the scope of the data warehouse. This follows from the level of control and management that can be exercised over such data compared with public data. Nevertheless, even though personal data is outside the scope of the warehouse, its place in the architecture must be defined.


The three-layer architecture allows personal data to exist in both the real-time data layer and the derived data layer. At the conceptual level, there is no distinction between personal and public data in these two layers. Personal data may be centralized or distributed. It may be reconciled with public data or derived from it. Personal data does not exist in the reconciled data layer, because that layer is the single valid representation of the enterprise data model (EDM) and is therefore the antithesis of personal data. At the logical level, the distinction between public and personal data is needed in the derived data layer.

3.3. External data

3.3.1. External management information
Analyzing and understanding the performance of a company requires structured access to the company's consolidated operational data. However, to understand the company's prospects in the marketplace and to plan successfully for the future, access to data about the market as a whole is a strong requirement. As a result, strategic planners and executives often need a significant amount of external data.
In the past, this external data was collected informally and without automation, and it was therefore effectively ignored by information systems. With executives' growing use of computers and the Internet, and with the wide availability of external data, external data has become an important consideration for the warehouse. The popularity of the Internet today is causing explosive growth in the amount and variety of information. Moreover, the decisions made in organizations carry significant financial implications. When decisions are based on external information, the external data brought into the business needs to be managed and controlled as carefully as internal data.


3.3.2. Electronic data interchange (EDI)
Another increasingly common means of transferring data between organizations is electronic data interchange (EDI). EDI is primarily an operational process, the means by which operational applications in two companies exchange information. The type of data involved is real-time data. As with any other input to operational applications, EDI data is subject to validation and other checks as part of its processing by the operational application that accepts it. As a result, by the time the informational environment sees this data, it has been assimilated into internal real-time data. As shown in Figure 13, EDI therefore interacts with the data warehouse only by way of the operational systems.

Figure 13: The data warehouse and external data


3.4. Metadata in the data warehouse
Metadata is required across all three layers of the architecture. However, not every data warehouse requires all of these layers.

Figure 14: The placement of metadata in the three-layer architecture

Figure 14 shows the general architectural requirements for build-time metadata, which includes the definitions of the three layers and explains the relationships among them. Different modeling tools may be used for different environments, but the resulting metadata definitions must be consistent.

3.5. The data warehouse catalog (DWC)
Within the set of metadata defined, a subset can be identified that is needed to use and manage the data warehouse. This subset goes by many names, such as business data directory, business information directory, and information directory. Some of these terms cover only part of the use to which the metadata in the warehouse is put. We focus on the content of the metadata and use the term data warehouse catalog (DWC) for this subset. The DWC contains all the metadata needed to use and manage the data warehouse. It therefore includes all of the usage metadata and part of the control metadata associated with the business data warehouse and the business information warehouses, as well as part of the usage metadata associated with the operational systems, as shown in Figure 15.

Figure 15: The data warehouse catalog

Build-time metadata is not included in the DWC, because the process of building the warehouse is logically separate from the process of using and managing it. However, much of the build-time metadata is duplicated in the control and usage components. Some control metadata in the informational environment is also excluded from the DWC, because it exists only to support underlying components. The parts of the control metadata that are included relate to the scheduling and currency of the data. The DWC also includes part of the usage metadata of the operational systems. This part describes the specific use of data in the operational environment, which may differ from its use in the informational environment but is valuable in helping users understand the ultimate origin of their data.

The DWC, and the means by which end users access and use it, are essential components of any data warehouse implementation. Together they give users the ability to make effective use of the business data stored in the warehouse.

3.6. Operational systems
Although they lie outside the data warehouse, operational systems are its principal source. The structure and architecture of the operational systems are a major factor in determining how complex a data warehouse is to implement. A basic premise of data warehousing is that the operational systems do not need to be redesigned to any significant extent in order to build the warehouse. The architectural direction of the operational systems, however, often derives from the design of the data warehouse.

3.7. Data warehouse functionality
In studying the logical architecture, we have focused on data-related aspects because of the importance of the coherence, consistency, and integration of data in the warehouse. Equally important is the function needed to support the data architecture as described. This section introduces and positions these functions. Figure 7.8 shows the three-layer architecture for business data, extended to include metadata. It has been simplified to highlight the clarity of the architecture.
There are basic similarities between the processes for populating the different targets, and they use a common set of data replication tools. However, there are also significant differences between the different kinds of population. For example, populating the business data warehouse requires considerably more complex enhancement of the data during replication than populating a business information warehouse. Similarly, populating the data warehouse catalog (DWC) is less time-critical than populating the business data warehouse or the business information warehouses. This leads to a functional distinction between BDW, BIW, and DWC population, as shown in Figure 16.

Figure 16: The population functionality of the warehouse

The second extension of functionality provides for access to and use of the business data and metadata in the warehouse. End users use business data and metadata in different ways. Business data is queried and analyzed, whereas metadata is explored (but not analyzed) in order to understand the business data. These different uses lead to two functional components. The business information interface (BII) provides the functionality needed for business data, while the business information guide (BIG) provides the functionality needed for metadata.

The BII (Business information interface) is the interface for accessing business data.

The BIG (Business information guide) provides the functions needed to use the data warehouse catalog to find relevant business data and to understand its significance and the benefits of using it. This functionality requires more complex access to the data warehouse catalog (DWC).

Figure 17: The complete logical architecture of the warehouse

Data warehouse management comprises a number of functions for operating and managing the entire data warehouse environment and the underlying components already defined. These functions are:
- Data access: some physical formats and locations of data may require dedicated data access components.
- Process management: needed to coordinate activities, which often run on different platforms.
- Data transfer: this function is required to physically move data into and within the data warehouse. It provides the transport layer needed by the population functionality and supports the large volumes of data moved at each level.
- Security: the data warehouse holds the organization's entire data asset, so security is required to control access to and use of the data in it.
- Database management: because the data warehouse is physically realized as a set of centralized and classified base data, database management functionality is mandatory.

Chapter 4. LANGUAGE FOR THE DATA WAREHOUSE

4.1. Concept

OLAP stands for "On-Line Analytical Processing", that is, a system dedicated to analyzing data online. The data warehouse is the input to online analytical processing. OLAP arises from the need to analyze historical and current data in order to support decisions that are accurate, timely, and low in risk. This is also the greatest need of every business when making strategic decisions for the company, especially large manufacturing companies with large volumes of data.

4.2. The nature of OLAP

The core of OLAP is that data is taken from the data warehouse or from a data mart, converted into a multidimensional model, and stored in a multidimensional data store.

4.3. OLAP focuses on the following operations:

- Roll-up: for example, grouping data by year instead of by quarter.
- Drill-down: for example, expanding the data to view it by month instead of by quarter.
- Slice: viewing one layer at a time. For example, from the sales listings for Q1, Q2, Q3, and Q4, viewing only Q1.
- Dice: removing part of the data (equivalent to adding conditions to the WHERE clause in SQL).

4.4. The main object of OLAP

The main object of OLAP is the cube, a multidimensional representation of detailed and summary data. A cube consists of a fact table, one or more dimension tables, measures, and partitions.

4.4.1. Cube

The cube is the central element of online analytical processing. It is a subset of the data in the data warehouse, organized and summarized into multidimensional structures.
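The four operations listed in section 4.3 can be sketched on a very small cube. This is only an illustration under stated assumptions: the sales rows, the column names, and the helper functions `aggregate` and `select` are hypothetical and do not come from any particular OLAP product.

```python
from collections import defaultdict

# Hypothetical sales facts: (year, quarter, month, product, amount).
FACTS = [
    (2009, "Q1", 1, "book", 100),
    (2009, "Q1", 2, "book", 150),
    (2009, "Q1", 3, "pen", 50),
    (2009, "Q2", 4, "book", 200),
    (2009, "Q2", 5, "pen", 70),
    (2010, "Q1", 1, "book", 120),
]
COLUMNS = ("year", "quarter", "month", "product", "amount")
ROWS = [dict(zip(COLUMNS, fact)) for fact in FACTS]


def aggregate(rows, by):
    """Sum the 'amount' measure, grouped by the given dimension columns."""
    totals = defaultdict(int)
    for row in rows:
        key = tuple(row[column] for column in by)
        totals[key] += row["amount"]
    return dict(totals)


def select(rows, **conditions):
    """Keep rows whose columns fall in the allowed value sets (like SQL WHERE)."""
    return [
        row for row in rows
        if all(row[column] in allowed for column, allowed in conditions.items())
    ]


by_quarter = aggregate(ROWS, by=("year", "quarter"))
# Roll-up: move from quarter level up to year level.
roll_up = aggregate(ROWS, by=("year",))
# Drill-down: move from quarter level down to month level.
drill_down = aggregate(ROWS, by=("year", "quarter", "month"))
# Slice: fix a single value on one dimension.
slice_q1 = select(ROWS, quarter={"Q1"})
# Dice: restrict several dimensions at once.
dice = select(ROWS, quarter={"Q1", "Q2"}, product={"book"})
```

Roll-up and drill-down only change the list of grouping columns, while slice and dice only filter rows, which is why the sketch needs just two helpers.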

To define a cube, we select a fact table and identify its measures (the numeric columns of interest to the cube's users). We then select the dimensions, each consisting of one or more columns from other related tables. The dimensions provide the descriptive categories by which the measures are broken down for the cube's users. Each dimension can contain a hierarchy of levels that subdivide the data for the user. Each level in a dimension is more detailed than its parent level. For example, continents contain countries, and states or provinces contain cities. Similarly, a time dimension hierarchy may consist of the levels year, quarter, month, and day.

4.4.2. Dimension

Dimensions describe the categories by which the numeric data in a cube is divided for analysis. To define a dimension, we select one or more columns from one of the linked tables (the dimension tables). If we select several columns, they must all be related to one another, so that their values can be organized into a single hierarchy. To define the hierarchy, order the columns from the most general to the most specific. For example, a Time dimension is created from the columns Year, Quarter, Month, and Day.

Each column in the dimension contributes one level to the dimension. The levels are ordered by specificity and organized in a hierarchy that provides logical paths for drilling down. For example, the Time dimension described above lets cube users drill down from Year to Quarter, from Quarter to Month, and from Month to Day. Each drill-down provides more specific detail.

Hierarchical dimensions: Hierarchies are the backbone of data aggregation. In other words, aggregation can only be performed on the basis of hierarchies. Most dimensions have a multilevel, hierarchical structure. If we are making decisions about product pricing to maximize revenue, we need to look at product revenue data aggregated by product price, which means performing one kind of aggregation. When we need to make other decisions, we need to perform other corresponding aggregations. There may therefore be a great many aggregation processes, so they must be easy and flexible to perform in order to support unplanned analyses. This can be achieved with the help of broad and deep hierarchies.

Roll-up and drill-down based on dimension hierarchies: Using a dimension hierarchy, from a lower level we can roll up to higher levels, performing an aggregation to obtain a more summarized result. From a higher level we can drill down to lower levels to obtain more detailed results.

4.4.3. Measures

The measures of a cube are columns in the fact table. Measures identify the numeric values from the fact table that are aggregated for analysis, such as price, cost, or quantity sold.

4.4.4. Partitions

Every cube has at least one partition that holds its data. A single partition is created automatically when the cube is defined. When we create a new partition for a cube, it is added to the set of partitions that already exist for that cube. The cube reflects the combined data in all of its partitions. A cube's partitioning is invisible to the user. Partitions are a powerful and flexible tool for managing OLAP cubes, especially large ones.

4.4.5. An example of organizing a data warehouse in the education system

This section presents that example. Traditionally, educational organizations and institutions have not focused on revenue and profit, but they care a great deal about added value and about competing on educational quality in order to attract and retain high-quality students. In practice, there is strong interest in understanding relationships that fall outside the scope of education itself. There is also an overarching need to understand who our student customers are and which courses they buy. Finally, we will look at the utilization of a university's facilities.

The distinguishing characteristics of an accumulating fact table are as follows:
- Each row represents the complete history of something.
- Such a fact table is best suited to short-lived processes, such as orders or invoices.
- Open-ended sets of accumulating facts capture the measures of interest.
- Each row is revisited and changed whenever an event occurs.
- Both the foreign keys and the accumulated facts may change during the revisit.

In applicant tracking, prospective students progress through a standard set of milestones. We may be interested in tracking activity around key dates such as: receipt of preliminary admissions test scores, information requested (via Web or otherwise), information sent, interview conducted, on-site campus visit, application received, transcript received, test scores received, recommendations received, first pass review by admissions, review for financial aid, final decision from admissions, accepted, admitted, and enrolled.

At any point in time, people in admissions and enrollment management are interested in how many applicants are at each stage of the process. Admissions staff can also analyze the applicant pool by a wide range of characteristics. The grain of the accumulating snapshot that tracks the application lifecycle is one row per prospective student. This grain represents the lowest level of detail captured when the prospect enters the pipeline. As more information is collected while the prospect progresses toward application, acceptance, and admission, we continue to revisit and update the prospect's status in the fact table row. Figure 18 illustrates this.
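The behavior of such an accumulating row can be sketched as follows. This is a minimal illustration, not the design from the cited book: the four milestone names are a hypothetical subset of the list above, the helper functions are invented for the example, and `None` stands in for a date that has not happened yet (a real warehouse would more likely point to a special row in the date dimension).

```python
from datetime import date

# Hypothetical subset of the milestones listed above, in pipeline order.
MILESTONES = (
    "information_requested",
    "application_received",
    "final_decision",
    "enrolled",
)


def new_row(applicant_id):
    """One row per prospective student; milestone dates start as unknown."""
    row = {"applicant_id": applicant_id}
    row.update({milestone: None for milestone in MILESTONES})
    return row


def record(row, milestone, when):
    """Revisit the existing row and fill in the date of a milestone."""
    if milestone not in MILESTONES:
        raise ValueError(f"unknown milestone: {milestone}")
    row[milestone] = when


def current_stage(row):
    """Latest milestone reached, or None if none has been reached."""
    reached = [m for m in MILESTONES if row[m] is not None]
    return reached[-1] if reached else None


def pipeline_counts(rows):
    """How many applicants currently sit at each stage of the pipeline."""
    counts = {milestone: 0 for milestone in MILESTONES}
    for row in rows:
        stage = current_stage(row)
        if stage is not None:
            counts[stage] += 1
    return counts


def lag_days(row, start, end):
    """Days between two milestones, or None if either has not happened."""
    if row[start] is None or row[end] is None:
        return None
    return (row[end] - row[start]).days


fact = {i: new_row(i) for i in (1, 2, 3)}
record(fact[1], "information_requested", date(2010, 1, 5))
record(fact[1], "application_received", date(2010, 2, 1))
record(fact[2], "information_requested", date(2010, 1, 10))
record(fact[3], "information_requested", date(2010, 1, 12))
record(fact[3], "application_received", date(2010, 1, 20))
record(fact[3], "final_decision", date(2010, 3, 1))
counts = pipeline_counts(fact.values())
```

The number of rows never grows as milestones are reached. Each event updates a date column in the applicant's existing row, which is what distinguishes this kind of table from one that appends a row per event.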


Figure 18: Student applicant pipeline as an accumulating snapshot

The fact table contains many date dimensions, corresponding to the standard milestones of the process. We want to analyze the prospect's progress by these dates to determine the pace of movement through the pipeline, and we also want to spot bottlenecks. This is especially important if we see a significant lag involving a candidate we are interested in recruiting. Each of these dates is treated as a role-playing dimension, using surrogate keys to handle dates that have not yet occurred when the row is first loaded.

The applicant dimension contains a number of interesting attributes about prospective students. Admissions analysts are interested in slicing and dicing by applicant characteristics such as geography, incoming credentials, gender, date of birth, ethnicity, and preliminary major. Analyzing these characteristics at various stages of the pipeline helps admissions staff adjust their strategies to encourage more students to proceed to the next milestone.

Factless fact tables

The fact tables we have designed so far share a characteristic structure. Each table typically has from three to about 15-20 key columns, followed by one or more numeric, continuously valued, preferably additive facts. The facts can be regarded as measurements taken at the intersection of the dimension key values. From this perspective, the facts are the justification for the fact table, and the key values are simply the administrative structure that identifies the facts.

Student registration events

There are many situations in which events need to be recorded as the simultaneous coming together of a number of dimensional entities. For example, we can track student registrations by term. The grain of the fact table is one row for each course registered by student and term. As illustrated in Figure 19, the fact table has the following dimensions: term, student, student major, course, and instructor.

Here we are working with fact data at the term level rather than by calendar day, week, or month. Term is the lowest level available for registration events. The term dimension should still conform to the calendar date dimension. In other words, each date in the daily calendar should identify the term and the academic year it belongs to.

Figure 19: Student registration events as a factless fact table.
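A registration table of this kind holds only dimension keys, so analysis reduces to counting rows. The sketch below assumes a hypothetical set of registration rows and two invented helper functions. The identifiers `s1`, `t1`, and the course names are placeholders, not data from the thesis.

```python
from collections import Counter

# Hypothetical registration events: (term, student, major, course, instructor).
# There is no numeric measure; each row only records that the event happened.
REGISTRATIONS = [
    ("Fall 2009", "s1", "CS", "Databases", "t1"),
    ("Fall 2009", "s2", "CS", "Databases", "t1"),
    ("Fall 2009", "s2", "CS", "Networks", "t2"),
    ("Fall 2009", "s3", "Math", "Databases", "t1"),
    ("Spring 2010", "s1", "CS", "Networks", "t2"),
]


def registrations_per_course(rows, term):
    """Count registration rows per course within one term."""
    return Counter(course for t, _s, _m, course, _i in rows if t == term)


def students_per_instructor(rows):
    """Count distinct students taught by each instructor across all terms."""
    seen = {}
    for _term, student, _major, _course, instructor in rows:
        seen.setdefault(instructor, set()).add(student)
    return {instructor: len(students) for instructor, students in seen.items()}


fall_counts = registrations_per_course(REGISTRATIONS, "Fall 2009")
teaching_load = students_per_instructor(REGISTRATIONS)
```

Because the rows carry no measure, every question is answered with a count or a distinct count over the keys.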



Facilities utilization coverage

The second type of factless fact table is the coverage table. We use a scenario dealing with facility management as an illustration. Universities invest a large amount of capital in their physical plant and facilities. It would be helpful to understand which facilities are being used for what purpose throughout that time. For example, which facilities are used most heavily? What is the average occupancy rate of the facilities as a function of time? How much does usage drop on Fridays, when nobody comes to teach classes?

Once again, a factless fact table comes to the rescue. In this case, the fact table contains one row for each facility for each standard time block on each day of the week, regardless of whether the facility is being used. This is illustrated in Figure 20.

The facility dimension includes all types of descriptive attributes about the facility, such as the building, the facility type (for example classroom, lab, or office), the floor area in square meters, the capacity, and the amenities (projector, whiteboard, and so on). The utilization status dimension contains a text descriptor with the value Available or Utilized. Several organizations may be involved in the utilization of a facility. For example, one organization may own the facility during a time block while another organization is registered as the facility's user.

Figure 20: Facilities utilization as a coverage factless fact table
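The coverage table described above can be sketched as the full cross product of facilities, days, and time blocks, with a status on every row. The facility names, the two-day week, the two time blocks, and the `utilization_rate` helper are all hypothetical simplifications chosen to keep the example short.

```python
from itertools import product

FACILITIES = ("Room A101", "Lab B2")
DAYS = ("Mon", "Fri")
BLOCKS = ("08:00", "10:00")

# Hypothetical bookings; every combination not listed here is Available.
BOOKED = {
    ("Room A101", "Mon", "08:00"),
    ("Room A101", "Mon", "10:00"),
    ("Room A101", "Fri", "08:00"),
    ("Lab B2", "Mon", "10:00"),
}

# Column positions within a coverage row.
FACILITY, DAY, BLOCK, STATUS = range(4)

# Coverage table: one row per facility, day, and time block, used or not.
COVERAGE = [
    (f, d, b, "Utilized" if (f, d, b) in BOOKED else "Available")
    for f, d, b in product(FACILITIES, DAYS, BLOCKS)
]


def utilization_rate(rows, column, value):
    """Share of Utilized rows among the rows where `column` equals `value`."""
    matching = [row for row in rows if row[column] == value]
    if not matching:
        return 0.0
    used = sum(1 for row in matching if row[STATUS] == "Utilized")
    return used / len(matching)


room_rate = utilization_rate(COVERAGE, FACILITY, "Room A101")
friday_rate = utilization_rate(COVERAGE, DAY, "Fri")
```

Storing the Available rows as well is what makes the occupancy rate computable, since the denominator is simply the number of rows for the facility or day in question.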


Student attendance events

We can imagine a similar schema for tracking student attendance in a course. In this case, the grain is one row for each student who attends the course's class each day. This factless fact table shares many of the same dimensions we discussed for registration events. The main difference is that the grain is by calendar date rather than by term. This dimensional model, illustrated in Figure 21, lets us answer questions such as: Which courses are the most heavily attended? Which students are enrolled in which courses? Which instructors teach the most students?

Figure 21: Student attendance fact table

Other areas of analytic interest

Other analytic processes can also be carried out in this example, such as human resources and procurement, which apply directly to the higher education environment given the desire to monitor and manage costs better. On the revenue side, we focus on research support, research funding, faculty research, and tuition revenue.


CONCLUSION

During the course of this project, I studied and presented the following topics:
1. An overview of the data warehouse: its concept, characteristics, benefits, goals, properties, and components
2. Basic concepts in data warehousing
3. The logical architecture of the data warehouse
4. The language for the data warehouse and an illustration of how a data warehouse is organized

This thesis offers an initial introduction to the basics of data warehousing, giving the reader a general and fundamental view of the data warehouse and related concepts. However, because of limited time and knowledge, the thesis inevitably has shortcomings. I therefore look forward to receiving comments and suggestions from my teachers and from all readers.

My sincere thanks!


REFERENCES

1. Barry Devlin, Data Warehouse, Addison Wesley, 1997.
2. Ralph Kimball, Margy Ross, The Data Warehouse Toolkit, pp. 1-65, 243-254, John Wiley & Sons, Inc., 2002.
3. http://vi.wikipedia.org/wiki/Kho_d%E1%BB%AF_li%E1%BB%87u
4. W. H. Inmon, OLAP and Data Warehouse, 2000.
