Apriori Algorithm in Data Mining with examples
Apriori Algorithm in Data Mining with examples
Apriori Helps in mining the frequent itemset.
Example of Apriori Algorithm
Let’s see an example of the Apriori Algorithm.
Minimum Support: 2
Step 1: Data in the database
Step 2: Calculate the support/frequency of all items
Step 3: Discard the items with minimum support less than 2
Step 4: Combine two items
Step 5: Calculate the support/frequency of all items
Step 6: Discard the items with minimum support less than 2
Step 6.5: Combine three items and calculate their support.
Step 7: Discard the items with minimum support less than 2
Result:
Only one itemset is frequent (Eggs, Tea, Cold Drink) because this itemset has minimum support 2
Example 2 of Apriori Algorithm
Let’s see another example of the Apriori Algorithm.
Minimum Support :3
Step 1: Data in the database
Step 2: Calculate the support/frequency of all items
Step 3: Discard the items with minimum support less than 3
Step 4: Combine two items
Step 5: Calculate the support/frequency of all items
Step 6: Discard the items with minimum support less than 3
Step 6.5: Combine three items and calculate their support.
Step 7: Discard the items with minimum support of less than 3. So all itemsets are excluded except “Eggs, Cold drink” because this itemset has the support of 3.
Result:
There is no frequent itemset because all itemsets have minimum support of less than 3.
Advantages of Apriori Algorithm
Apriori Algorithm is the simplest and easy to understand the algorithm for mining the frequent itemset.
Apriori Algorithm is fully supervised.
Apriori Algorithm is fully supervised so it does not require labeled data.
Apriori Algorithm is an exhaustive algorithm, so it gives satisfactory results to mine all the rules within specified confidence and sport.
Apriori principles
Downward closure property of frequent patterns
Downward closure property of frequent patterns, means that All subset of any frequent itemset must also be frequent.
Example of Downward closure property
If Notebook, Pencil, School Bag is a frequent itemset, then we can say that all of the following itemsets are frequent;
- Notebook
- Pencil
- School Bag
- Notebook, Pencil
- Notebook, School Bag
- Pencil, School Bag
Apriori pruning principle
If an itemset is infrequent, its superset should not be generated for getting the frequent itemset.
Examples of Apriori pruning principle
If Notebook, Pencil is a frequent itemset and School Bag is not frequent itemset, then we can say that all of the following itemsets are frequent;
- Notebook
- Pencil
- Notebook, Pencil
Apriori Candidates generation
Candidates can be generated by the self joining and Apriori pruning principles.
Step 1:
Self-joining of Apriori Candidates
Example of self-joining
A1 B1 C1 D1 E1
C1={A1 B1 C1, A1 B1 D1, A1 C1 D1, A1 C1 E1, B1 C1 D1}
Self-joining = C1 * C1A1 B1 C1 D1 from A1 B1 C1 and A1 B1 D1A1 C1 D1 E1 from A1 C1 D1 and A1 C1 E1
So frequent candidates are A1 B1 C1 D1 and A1 C1 D1 E1
Step 2:
Apriori pruning principle
Example of Apriori pruning principle
A1 B1 C1 D1 E1C1={A1 B1 C1, A1 B1 D1, A1 C1 D1, A1 C1 E1, B1 C1 D1} According to Apriori Pruning principle A1 C1 D1 E1 is remoA1ed because A1 D1 E1 is not in C1.
So frequent candidate is A1 B1 C1 D1
Important interview questions of Apriori Algorithm
Let’s see some important interview questions of Apriori Algorithm.
- What are the frequent itemsets?
- What is the apriori algorithm?
- What is the role of the apriori algorithm in data mining?
- Give some examples of the apriori algorithm in data mining.
- What are the advantages of the apriori algorithm?
- What are the disadvantages of the apriori algorithm?
- How does the Apriori algorithm help in mining the frequent itemset?
- What is Apriori pruning principle?
- How we can generate the Apriori candidates?
- What is self joining of candidates in Apriori?
Implementation of the Apriori Algorithm in C++
This is the demo of Apriori algorithm in which we are taking the list of 5 lists of purchases items and getting the result of apriori.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197 198 199 200 201 202 203 204 205 206 207 208 209 210 211 212 213 214 215 216 217 218 219 220 221 222 223 224 225 226 227 228 229 230 231 232 233 234 235 236 237 238 | #include<iostream> #include<conio.h> using namespace std; int main() { int m,i,k,l,j,t1,f; int list1_found,list2_found,list3_found; //Initial item-purchase int apriori[5][5]; for(i=0;i<5;i++) { cout<<"\n Enter list of purchased items"<<i+1<<":"<<endl; for(j=0;j<5;j++) { cin>>apriori[i][j]; } } //Defining minimum level for acceptence int min; cout<<"\n Enter minimum acceptance level"<<endl; cin>>min; //Printing initial input cout<<"\nInitial Input:\n"<<endl; cout<<"\nTrasaction\tItems\n<<endl"; for(i=0;i<5;i++) { cout<<i+1<<":\t"; for(j=0;j<5;j++) { cout<<apriori[i][j]<<"\t"; } cout<<"\n"; } cout<<"\nAssume minimum support: "<<min<<endl; //First turn int list_1[5]; for(i=0;i<5;i++) { t1=0; for(j=0;j<5;j++) { for(k=0;k<5;k++) { if(apriori[j][k]==i+1) { t1++; } } } list_1[i]=t1; } //Printing first turn cout<<"\n\nGenerating C1 from data\n"<<endl; for(i=0;i<5;i++) { cout<<i+1<<": "<<list_1[i]<<"\n"<<endl; } //Second turn //Counting number of possibilities for turn2 int p2pcount=0; int list2_Purchased_Items[5]; int p2pos=0; for(i=0;i<5;i++) { if(list_1[i]>=min) { p2pcount++; list2_Purchased_Items[p2pos]=i; p2pos++; } } //Printing selected items for second turn cout<<"\nGenerating LIST_1 From C1\n"<<endl; for(i=0;i<p2pos;i++) { cout<<list2_Purchased_Items[i]+1<<"\t"<<list_1[list2_Purchased_Items[i]]<<"\n"<<endl; } //Joining items int list2[5][3]; int firstItem2join; //will hold first item for join int secondItem2join; //will hold second item for join int list2pos1=0; //position pointer in list2 array int list2ocount=0; //product join occruance counter int list2jcount=0; //join counter for(i=0;i<p2pcount;i++) { for(j=i+1;j<p2pcount;j++) { firstItem2join=list2_Purchased_Items[i]+1; secondItem2join=list2_Purchased_Items[j]+1; if(firstItem2join==secondItem2join) { //it is self join continue; } //join the items list2[list2pos1][0]=firstItem2join; list2[list2pos1][1]=secondItem2join; list2jcount++; //count occurances list2ocount=0; //reset counter for(k=0;k<5;k++) { list1_found=list2_found=0; //resetting flag //scan a purcahse for(l=0;l<5;l++) { if(firstItem2join==apriori[k][l]) { //one of the element found list1_found=1; } if(secondItem2join==apriori[k][l]) { //second items also found list2_found=1; } } //one purchase scanned if(list1_found==1&&list2_found==1) //both items are present in purchase { list2ocount++; } } //assign count list2[list2pos1][2]=list2ocount; list2pos1++; } } //Printing second turn cout<<"\n\nGenerating LIST2\n"<<endl; for(i=0;i<list2jcount;i++) { for(j=0;j<3;j++) { cout<<list2[i][j]<<"\t"; } cout<<"\n"; } //Third turn int p3pcount=0; int Purchased_Items[5]={-1,-1,-1,-1,-1}; int p3pos=0; for(i=0;i<5;i++) { if(list2[i][2]>=min) { f=0; for(j=0;j<5;j++) { if(Purchased_Items[j]==list2[i][0]) { f=1; } } if(f!=1) { Purchased_Items[p3pos]=list2[i][0]; p3pos++; p3pcount++; } f=0; for(j=0;j<5;j++) { if(Purchased_Items[j]==list2[i][1]) { f=1; } } if(f!=1) { Purchased_Items[p3pos]=list2[i][1]; p3pos++; p3pcount++; } } } //Joining int list3[5][4]; int list3ocount=0; //occurance counter int list3jcount=0; //join counter for(i=0;i<p3pcount;i++) { for(j=i+1;j<p3pcount;j++) { for(k=j+1;k<p3pcount;k++) { list3[i][0]=Purchased_Items[i]; list3[i][1]=Purchased_Items[j]; list3[i][2]=Purchased_Items[k]; list3jcount++; //count occurances list3ocount=0; //reset counter for(k=0;k<5;k++) { list1_found=list2_found=list3_found=0; //resetting flag //scan a purcahse for(l=0;l<5;l++) { if(list3[i][0]==apriori[k][l]) { //one of the element found list1_found=1; } if(list3[i][1]==apriori[k][l]) { //second items also found list2_found=1; } if(list3[i][2]==apriori[k][l]) { //third element also found list3_found=1; } } //one purchase scanned if(list1_found==1&&list2_found==1&&list3_found==1) //all items are present in purchase { list3ocount++; } } //assign count list3[i][3]=list3ocount; } } } //Printing second turn cout<<"\n\nGenerating LIST3\n"; for(i=0;i<list3jcount;i++) { for(j=0;j<4;j++) { cout<<list3[i][j]<<"\t"; } cout<<"\n"; } getch(); } |
Output
Video Lecture
Next Similar Tutorials
- Frequent pattern Mining, Closed frequent itemset, max frequent itemset in data mining – Click Here
- Support, Confidence, Minimum support, Frequent itemset, K-itemset, absolute support in data mining – Click Here
- Apriori Algorithm in Data Mining with examples – Click Here
- Apriori principles in data mining, Downward closure property, Apriori pruning principle – Click Here
- Apriori candidates’ generations, self-joining, and pruning principles. – Click Here.