I have a large amount of data, where each record consists of three fields:
type, first value, second value
Say there are 100 such records right now. How should I handle them?
Step one
My idea is to build a 1:n:1 structure with a dictionary.
The problem is this: if the type is the key, what should happen when several records share the same type? Dictionary keys are unique.
Step two: I want to get the second value through the first value. How can I get the second value when I don't know the first value in advance?
Thanks!
My wording may not be clear, so here is an example of 1:n:1.
Suppose there are two records like this:
type, first value, second value
(1) zhangsan 2017-01-01 Yes, I am here
(2) zhangsan 2017-05-01 I am leaving
Then I want to build something like this:
'zhangsan':{{'2017-01-01': 'Yes, I am here'},{'2017-05-01': 'I am leaving'}}
This is the result I want.
In other words, 1:n:1 means type : first value : second value.
I don't know whether there is a feasible solution.
First: if the first value is unique within the same type, you can try the following structure:
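from collections import defaultdict

value = '1:n:1'
mapping = defaultdict(dict)  # a missing type defaults to an empty dict
value_type, fir_val, sec_val = value.split(':')
# Assign into the nested dict instead of rebinding mapping to a plain dict
mapping[value_type][fir_val] = sec_val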
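The above creates a dictionary, mapping, whose default value is itself a dictionary, and then uses the type and the first value as the keys of the outer and inner levels. In my view, under the joint constraint of the type and the first value, the second value that is found should be unique. If it is not unique, you need to decide whether to overwrite it or to store the values in a list.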
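As a minimal sketch of that idea, the two records from the question can be loaded into a nested dictionary. The `records` list and the name `mapping_multi` are assumptions made for illustration; the second structure shows the list variant for the case where a (type, first value) pair may repeat.

```python
from collections import defaultdict

# Sample records modeled on the question: (type, first value, second value)
records = [
    ("zhangsan", "2017-01-01", "Yes, I am here"),
    ("zhangsan", "2017-05-01", "I am leaving"),
]

# Case 1: the first value is unique within a type, so a later duplicate overwrites.
mapping = defaultdict(dict)
for value_type, fir_val, sec_val in records:
    mapping[value_type][fir_val] = sec_val

# Case 2: duplicates are kept by collecting the second values in a list.
mapping_multi = defaultdict(lambda: defaultdict(list))
for value_type, fir_val, sec_val in records:
    mapping_multi[value_type][fir_val].append(sec_val)

print(mapping["zhangsan"]["2017-05-01"])
print(dict(mapping["zhangsan"]))
```

Which of the two to use depends on whether a repeated (type, first value) pair should replace the old entry or sit alongside it.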
Second:
With the data structure I described above, if you don't know the first value, you can only traverse mapping[value_type] and check whether each value is the one you want.
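A hedged sketch of that traversal follows. The helper name `find_by_second_value` is hypothetical, and `mapping` is assumed to have the shape type -> first value -> second value.

```python
def find_by_second_value(mapping, value_type, wanted):
    """Return every first value under value_type whose second value equals wanted."""
    # .get avoids a KeyError for an unknown type (and does not create an
    # empty entry if mapping happens to be a defaultdict).
    inner = mapping.get(value_type, {})
    return [fir_val for fir_val, sec_val in inner.items() if sec_val == wanted]


mapping = {
    "zhangsan": {
        "2017-01-01": "Yes, I am here",
        "2017-05-01": "I am leaving",
    }
}

# Without knowing the first value, every entry under the type has to be scanned.
for fir_val, sec_val in mapping["zhangsan"].items():
    print(fir_val, sec_val)

print(find_by_second_value(mapping, "zhangsan", "I am leaving"))
```

The scan is linear in the number of entries under one type, which is usually acceptable unless a single type holds a very large number of records.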
What does 1:n:1 mean? A mapping relationship? It doesn't matter much, though. I have a simple, brute-force approach: write each record as a 3-tuple (type, val1, val2) and store all the records in an array [].
While building the array, also build 3 dicts: typeDict = {type: [arrIdx]}, val1Dict = {val1: [arrIdx]}, val2Dict = {val2: [arrIdx]}
When you want to find data by type, use typeDict to find the positions of all records with that type.
val1 and val2 work the same way.
When you want to find records with type A and val1 = n, you only need to intersect the result sets found from typeDict and val1Dict.
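The approach above might look like the following sketch. It stores positions in sets rather than lists so that the intersection is a single operation, which is a small deviation from the description. The `lookup` helper and the "lisi" record are hypothetical additions for illustration.

```python
from collections import defaultdict

# Each record is a 3-tuple (type, val1, val2); the sample rows are illustrative.
records = [
    ("zhangsan", "2017-01-01", "Yes, I am here"),
    ("zhangsan", "2017-05-01", "I am leaving"),
    ("lisi", "2017-01-01", "Hello"),
]

# One index per field, mapping a field value to the set of array positions.
type_dict = defaultdict(set)
val1_dict = defaultdict(set)
val2_dict = defaultdict(set)
for idx, (rec_type, val1, val2) in enumerate(records):
    type_dict[rec_type].add(idx)
    val1_dict[val1].add(idx)
    val2_dict[val2].add(idx)


def lookup(rec_type=None, val1=None, val2=None):
    """Return the records matching every given field by intersecting index sets."""
    hits = set(range(len(records)))
    if rec_type is not None:
        hits &= type_dict.get(rec_type, set())
    if val1 is not None:
        hits &= val1_dict.get(val1, set())
    if val2 is not None:
        hits &= val2_dict.get(val2, set())
    return [records[i] for i in sorted(hits)]


print(lookup(rec_type="zhangsan", val1="2017-01-01"))
```

The cost is extra memory for the three indexes, in exchange for lookups by any field or combination of fields without scanning the whole array.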
If the amount of data is large, it would probably be more efficient to use a database such as MySQL, or pandas, a library that specializes in data processing. pandas also has a dedicated to_dict function.
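To give a rough idea of the database route, here is a sketch that uses Python's built-in sqlite3 module as a stand-in for a server such as MySQL. That is a substitution made only so the example runs without any setup. The table name, column names, and index name are assumptions.

```python
import sqlite3

# In-memory SQLite database, used here in place of a server such as MySQL.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE records (type TEXT, val1 TEXT, val2 TEXT)")
conn.executemany(
    "INSERT INTO records VALUES (?, ?, ?)",
    [
        ("zhangsan", "2017-01-01", "Yes, I am here"),
        ("zhangsan", "2017-05-01", "I am leaving"),
    ],
)
# An index on (type, val1) lets the database avoid a full scan when
# filtering by type, or by type and first value.
conn.execute("CREATE INDEX idx_type_val1 ON records (type, val1)")

# Fetch every (first value, second value) pair for one type.
rows = conn.execute(
    "SELECT val1, val2 FROM records WHERE type = ? ORDER BY val1",
    ("zhangsan",),
).fetchall()
by_first_value = dict(rows)
print(by_first_value)
```

The query result converts directly into the inner dictionary from the question, so the database can serve as storage while the lookups in application code stay dictionary-based.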