Hi,
I am implementing an aggregation model to compute data cubes.
I have implemented multiway array aggregation statically. That is, if there are, say, 4 dimensions, each of cardinality 16, and a chunk size of 4, I create chunkArray[4][4][4][4], insert each measure at the appropriate position, and aggregate. The chunkArray is sparse, so the static implementation is not efficient. In order to create the chunkArray dynamically, with a measure stored only for the instances that occur, I need the offset of each cell in the chunk.
E.g., with 3 dimensions:
A[8], B[8], C[8] - each dimension with cardinality 8.
If the relational database table is fully populated then there will be 8*8*8 instances in the table.
If not, then some of the instances will be missing.
a1,b1,c1,20
a1,b1,c2,3
.......
.......
a1,b1,c8,12
This is like a truth table with 3 variables, each of which can take up to 8 values, as compared to 2 values in a digital truth table.
The aggregation will be done on chunks. If I have only the instances a1,b1,c1 - a1,b1,c3 - a1,b1,c8, then the other instances are missing. What I need then is the offset of each instance from the first instance.
a1,b1,c1 offset=0
a1,b1,c3 offset=2
a1,b1,c8 offset=7
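To make the offsets above concrete, here is a minimal sketch of how they could be computed. It assumes row-major order (the last dimension, C, varies fastest) and member labels of the form letter plus 1-based number; the names linear_offset and parse_member are hypothetical and only for illustration.

```python
def linear_offset(coords, cardinalities):
    """Row-major offset of a cell, given 0-based indices per dimension."""
    offset = 0
    for index, size in zip(coords, cardinalities):
        if not 0 <= index < size:
            raise ValueError(f"index {index} out of range for cardinality {size}")
        # Shift what we have so far by this dimension's size, then add the index.
        offset = offset * size + index
    return offset


def parse_member(label):
    """Turn a label such as 'c3' into the 0-based index 2 (assumed label format)."""
    return int(label[1:]) - 1


cardinalities = (8, 8, 8)  # A[8], B[8], C[8]
for row in [("a1", "b1", "c1"), ("a1", "b1", "c3"), ("a1", "b1", "c8")]:
    coords = tuple(parse_member(member) for member in row)
    print(",".join(row), "offset =", linear_offset(coords, cardinalities))
```

With this ordering the offset depends only on the coordinates of the instance itself, so a missing row does not shift the offsets of the rows that follow it.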
Suppose the chunk size is 2 in all dimensions; then each chunk will hold 2*2*2 = 8 cells. I will bring in the 8 cells of the first chunk and aggregate. But the problem is that, of those 8 cells, only 3 are populated and the other 5 are not there. I am stuck on finding the offset of a cell within its chunk. If you have any idea, please let me know.
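One possible way to avoid allocating the missing cells is sketched below. It assumes cardinality 8 and chunk size 2 in every dimension (so 4 chunks per dimension and 8 cells per chunk), row-major ordering for both the chunk number and the offset inside the chunk, and SUM as the aggregate. The names chunk_address and load_sparse_chunks are hypothetical, and this is not the standard multiway algorithm itself, only the addressing step.

```python
from collections import defaultdict


def chunk_address(coords, cardinalities, chunk_sizes):
    """Return (chunk_id, cell_offset) for a cell given 0-based coordinates."""
    chunk_id = 0
    cell_offset = 0
    for index, card, size in zip(coords, cardinalities, chunk_sizes):
        chunks_in_dim = -(-card // size)  # ceiling division
        # Which chunk along this dimension, folded into a row-major chunk number.
        chunk_id = chunk_id * chunks_in_dim + index // size
        # Position inside that chunk, folded into a row-major cell offset.
        cell_offset = cell_offset * size + index % size
    return chunk_id, cell_offset


def load_sparse_chunks(rows, cardinalities, chunk_sizes):
    """Store only the cells that occur: {chunk_id: {cell_offset: summed measure}}."""
    chunks = defaultdict(dict)
    for *coords, measure in rows:
        chunk_id, cell_offset = chunk_address(coords, cardinalities, chunk_sizes)
        cells = chunks[chunk_id]
        cells[cell_offset] = cells.get(cell_offset, 0) + measure
    return chunks


# Rows a1,b1,c1,20 / a1,b1,c2,3 / a1,b1,c8,12 written with 0-based coordinates.
rows = [(0, 0, 0, 20), (0, 0, 1, 3), (0, 0, 7, 12)]
chunks = load_sparse_chunks(rows, (8, 8, 8), (2, 2, 2))
for chunk_id, cells in sorted(chunks.items()):
    print("chunk", chunk_id, "cells", cells)
```

Because each chunk is a dictionary keyed by cell offset, the cells that do not occur take no space, and aggregating a chunk means iterating over only its stored entries.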