Python beginner question: merging multiple lines of a large txt file by condition

迷茫 2017-04-18 10:34:26

The data format is as follows:
······
1107 1385331000000 1.3142511607126754
1107 1385331000000 0.0021683196661660157
1107 1385331600000 0.0021683196661660157
1107 1385331600000 1.4867805985670923
1107 1385331600000 0.0021683196661660157
1107 1385332200000 1.1697626938303243
1107 1385332800000 0.0021683196661660157
1107 1385332800000 0.005813069022279304
1107 1385332800000 1.2847329440609827
1107 1385332800000 0.0021683196661660157
1107 1385333400000 1.2891586380834603
1108 1385247600000 0.026943168177151356
1108 1385247600000 6.184696475262653
1108 1385248200000 0.05946288920050806
1108 1385248200000 6.359572785335356
1108 1385248200000 0.010602880590260044
1108 1385248800000 0.026943168177151356
1108 1385248800000 5.568047923787272
1108 1385249400000 0 0.01024202685104009
1108 1385249400000 5.213017822855314
1108 1385250000000 0.01024202685104009
1108 1385250000000 5.385327254217893
1108 1385250600000 0.016259860511678353
1108 1385250600000 4.902644074658115
1108 1385251200000 4.141288808488436
1108 1385251800000 0.05388633635430271
1108 1385251800000 4.684096694966861
1108 1385251800000 0.01024202685104009
1108 1385252400000 4.386580113177049
1108 1385253000000 4.582219390797833
1108 1385253600000 5.211061096279831
1108 1385254200000 0.02048405370208018
1108 1385254200000 3.901546051563316
1108 1385254200000 0.01024202685104009
1108 1385254800000 4.0387888693118255
······
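The fields in each line are separated by tabs.
The first column is a label, the second column is a UTC timestamp, and the third column is traffic data. Each line covers a 10-minute interval. I want to add up the data for each hour within the same label (for example, all lines whose first column is 1107) and write the result as one new line. The time in the second column of the new line can be either a timestamp or a time interval. I have no idea where to start, so any pointers would be appreciated.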

All replies (8)

黃舟 2017-04-18 10:36:26, floor 8

I worked it out myself. The code below may be more complicated than necessary, but it does what I need:

# Sum the third column per (area, timestamp) and write one line per pair.
# Rows belonging to one area are assumed to be contiguous in the input.
IN_PATH = 'sms-call-internet-mi-2013-11-24-24.txt'
OUT_PATH = 'day24.txt'
MAX_AREA = 10000  # a row with a larger area id ends processing, as before


def flush(out, area, sums):
    # Write the totals collected for the area that has just finished.
    for ts, total in sums.items():
        out.write("%-8s%-16s%.20f\n" % (area, ts, total))


# The with statement closes both files; the original wrote file.close
# without parentheses, which never actually closed them.
with open(IN_PATH, 'r') as src, open(OUT_PATH, 'a') as out:
    out.write("area       time            data\n")
    src.readline()  # skip the first line of the input, as before
    current_area = None
    sums = {}  # timestamp -> running total (insertion-ordered dict)
    for line in src:
        a = line.split()
        if not a:
            continue  # ignore blank lines
        if int(a[0]) > MAX_AREA:
            break
        if a[0] != current_area:
            # A new area starts: write out the previous one and reset.
            if current_area is not None:
                flush(out, current_area, sums)
            current_area = a[0]
            sums = {}
        value = 0.0 if a[2] == "NA" else float(a[2])  # treat "NA" as zero
        sums[a[1]] = sums.get(a[1], 0.0) + value
    if current_area is not None:
        flush(out, current_area, sums)  # write the last area

左手右手慢動作 2017-04-18 10:36:26, floor 7

If the file is already ordered as a time series, just use a generator to read the original file, build each new line as you go, and write it out.
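A minimal sketch of that generator idea, assuming the lines are already sorted by label and then by time (as in the sample) and that the timestamps are in milliseconds. The names `hourly_sums` and `HOUR_MS` are made up for illustration. Only one running total is held in memory at a time:

```python
HOUR_MS = 3600 * 1000  # assumption: timestamps are in milliseconds


def hourly_sums(lines):
    """Yield (label, hour_start_ms, total) for each run of lines
    that share a label and fall in the same hour."""
    current_key = None
    total = 0.0
    for line in lines:
        fields = line.split()
        if len(fields) != 3:
            continue  # skip blank or malformed lines
        label, ts_ms = fields[0], int(fields[1])
        value = 0.0 if fields[2] == "NA" else float(fields[2])
        key = (label, ts_ms // HOUR_MS * HOUR_MS)  # floor to the hour
        if key != current_key:
            if current_key is not None:
                yield current_key[0], current_key[1], total
            current_key, total = key, 0.0
        total += value
    if current_key is not None:
        yield current_key[0], current_key[1], total  # last group


# Simplified values for illustration; a real run would pass an open file.
sample = [
    "1108\t1385250600000\t0.5",
    "1108\t1385250600000\t4.0",
    "1108\t1385251200000\t4.25",
    "1108\t1385251800000\t0.25",
]
for label, hour_ms, total in hourly_sums(sample):
    print(label, hour_ms, total)
```

With a real file, `hourly_sums(open(path))` would stream it line by line. If the input is not sorted, the same label and hour can be yielded more than once.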

左手右手慢動作 2017-04-18 10:36:26, floor 6

pandas can handle this: read the data into a DataFrame and then process it there.

劉奇 2017-04-18 10:36:26, floor 5

That depends on how much data you have.

Iterate over the file handle directly instead of calling readlines(), which loads the whole file at once and may run out of memory.
Keep the running totals in a dictionary-like data structure. If even that does not fit in memory, you will have to find a way to write intermediate results to disk.

The general idea is as follows:

from collections import Counter
c = Counter()
f = ['1107 1385332800000 1.2847329440609827',
'1107 1385332800000 0.0021683196661660157',
'1107 1385333400000 1.2891586380834603',
'1108 1385247600000 0.026943168177151356',
'1108 1385247600000 6.184696475262653',
'1108 1385248200000 0.05946288920050806' ]

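'''
with open('xxoo.txt') as f:  # f is the file handle to iterate over; it stands in for the list f above
    for i in f:
        s = i.split()
        c[s[0]] += float(s[2])  # convert to float before adding
'''


for i in f:  # this iterates over the list f; with real data, use the file handle f shown above
    s = i.split()  # splits on whitespace; use print(s) to inspect the result
    c[s[0]] += float(s[2])  # c accumulates the total for each label

for i in c:
    print(i, c[i])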
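The snippet above totals everything per label. To get one total per label and hour, as the question asks, the key can be widened to a (label, hour) pair. This is only a sketch of that extension: `HOUR_MS` and `sum_by_label_and_hour` are illustrative names, and it assumes the timestamps are in milliseconds.

```python
from collections import Counter

HOUR_MS = 3600 * 1000  # assumption: timestamps are in milliseconds


def sum_by_label_and_hour(lines):
    """Return a Counter mapping (label, hour_start_ms) to the summed value."""
    totals = Counter()
    for line in lines:
        label, ts_ms, value = line.split()
        hour_ms = int(ts_ms) // HOUR_MS * HOUR_MS  # floor to the hour
        totals[(label, hour_ms)] += float(value)
    return totals


# Simplified values for illustration; a real run would pass an open file.
rows = [
    '1107 1385332800000 1.25',
    '1107 1385332800000 0.5',
    '1107 1385333400000 1.0',
    '1108 1385247600000 0.25',
    '1108 1385247600000 6.0',
    '1108 1385251200000 0.5',
]
totals = sum_by_label_and_hour(rows)
for (label, hour_ms), total in sorted(totals.items()):
    print(label, hour_ms, total)
```

Unlike a streaming approach, this does not need sorted input, but it keeps one entry per (label, hour) pair in memory.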

PHPzhong 2017-04-18 10:36:26, floor 4

What you are doing is a grouped aggregation on two keys: label and hour. Read the file in with pandas, use to_datetime to convert the timestamp into a time column, and derive the hour from it. Then use groupby on the label and the hour together, and sum each group.
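A rough sketch of that pandas approach, assuming tab-separated input with no header row and millisecond timestamps. The column names `label`, `ts_ms`, `value` and `hour` are made up here, and the in-memory sample stands in for the real file:

```python
import io

import pandas as pd

HOUR_MS = 3600 * 1000  # assumption: timestamps are in milliseconds

# Simplified values for illustration; pass a file path instead of StringIO.
sample = (
    "1107\t1385331000000\t1.25\n"
    "1107\t1385331600000\t0.5\n"
    "1108\t1385247600000\t6.0\n"
    "1108\t1385248200000\t0.25\n"
    "1108\t1385251200000\t4.0\n"
)

df = pd.read_csv(
    io.StringIO(sample),
    sep="\t",
    header=None,
    names=["label", "ts_ms", "value"],
)

# Floor each timestamp to the start of its hour, then convert to datetime.
df["hour"] = pd.to_datetime(df["ts_ms"] // HOUR_MS * HOUR_MS, unit="ms")

# Group on both keys at once and sum the traffic within each group.
hourly = df.groupby(["label", "hour"], as_index=False)["value"].sum()
print(hourly)
```

This loads the whole file into memory. For a file too large for that, `read_csv` also accepts a `chunksize` argument, in which case the partial sums from each chunk would have to be combined afterwards.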

黃舟 2017-04-18 10:36:26, floor 3

You can borrow the approach from this thread:
https://www.zhihu.com/questio...

阿神 2017-04-18 10:36:26, floor 2

It helps to analyze your data format a little before writing any code:
1. The first column is the label, so use it as the first-level key of the result array: result[label].
2. The second column is a timestamp at 10-minute granularity. Since you want results by hour, initialize 24 elements for each result[label] item, keyed by hour (the timestamp of the start of that hour can serve as the key). Each value is the sum of the data within that hour, that is, result[label][hour].
3. Once the result array is initialized, the rest is simple: traverse the file and process it line by line. For each line, first read the value of the first column, for example 1107, and work on result[1107]. Then read the second column, find the matching hour key, and add the value to it.
4. Finally, traverse the result array and output the result.
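The steps above could be sketched roughly as follows. Instead of pre-initializing 24 slots per label, this version creates each hour bucket the first time it is seen, which also copes with data that spans more than one day. The names `aggregate` and `HOUR_MS` are illustrative, and millisecond timestamps are assumed.

```python
from collections import defaultdict

HOUR_MS = 3600 * 1000  # assumption: timestamps are in milliseconds


def aggregate(lines):
    """Build result[label][hour_start_ms] -> sum of values in that hour."""
    result = defaultdict(lambda: defaultdict(float))
    for line in lines:
        fields = line.split()
        if len(fields) != 3:
            continue  # skip blank or malformed lines
        label, ts_ms, value = fields
        hour_ms = int(ts_ms) // HOUR_MS * HOUR_MS  # key for the hour bucket
        result[label][hour_ms] += float(value)
    return result


# Simplified values for illustration; a real run would pass an open file.
sample = [
    "1107\t1385331000000\t1.25",
    "1107\t1385332800000\t1.5",
    "1108\t1385247600000\t6.0",
    "1108\t1385251200000\t4.0",
    "1108\t1385251800000\t0.5",
]
result = aggregate(sample)

# Step 4: traverse the nested result and output one line per label and hour.
for label in sorted(result):
    for hour_ms in sorted(result[label]):
        print("%s\t%d\t%s" % (label, hour_ms, result[label][hour_ms]))
```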

Peter_Zhu 2017-04-18 10:36:26, floor 1

You need:

from itertools import groupby

It can be done in fewer than ten lines of code.
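One way this might look, as a sketch rather than the exact code the reply had in mind. `itertools.groupby` only merges consecutive items with equal keys, so this assumes the file is already sorted by label and time, as the sample is. `parse`, `hourly_totals` and `HOUR_MS` are illustrative names, and millisecond timestamps are assumed.

```python
from itertools import groupby

HOUR_MS = 3600 * 1000  # assumption: timestamps are in milliseconds


def parse(line):
    """Turn one line into (label, hour_start_ms, value)."""
    label, ts_ms, value = line.split()
    return label, int(ts_ms) // HOUR_MS * HOUR_MS, float(value)


def hourly_totals(lines):
    """Yield (label, hour_start_ms, total) for consecutive equal keys."""
    rows = (parse(line) for line in lines if line.strip())
    for (label, hour_ms), group in groupby(rows, key=lambda r: (r[0], r[1])):
        yield label, hour_ms, sum(value for _, _, value in group)


# Simplified values for illustration; a real run would pass an open file.
sample = [
    "1107\t1385332800000\t1.25",
    "1107\t1385333400000\t1.25",
    "1108\t1385247600000\t6.0",
    "1108\t1385248200000\t0.5",
    "1108\t1385251200000\t4.0",
]
for label, hour_ms, total in hourly_totals(sample):
    print(label, hour_ms, total)
```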
