RAID threw errors and the system hit a kernel panic.

Posted by 하소양

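Hello.

The machine is:
Intel(R) Xeon(TM) CPU 2.40GHz
Fedora Core 3

I have a nextreme ES-A12U-6 RAID unit attached as sda1.
It had been working fine until now,
but when I tried to copy a fairly large amount of data,
the system suddenly spat out the messages below and died.
The RAID holds a lot of important data.
Judging from the RAID console, the HDDs themselves do not appear to have failed.
I would appreciate any advice on how to handle this.

The messages are as follows (excerpt):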

Jun 16 15:07:16 raid kernel: (scsi0:A:9:0): parity error detected in Data-in phase. SEQADDR(0x9e) SCSIRATE(0xc2)
Jun 16 15:07:16 raid kernel:  CRC Value Mismatch
Jun 16 15:07:16 raid kernel: (scsi0:A:9:0): Unexpected busfree in Message-out phase
Jun 16 15:07:16 raid kernel: SEQADDR == 0x16b
Jun 16 15:07:16 raid kernel: (scsi0:A:9:0): parity error detected in Data-in phase. SEQADDR(0x9e) SCSIRATE(0xc2)
Jun 16 15:07:16 raid kernel:  CRC Value Mismatch
Jun 16 15:07:16 raid kernel: (scsi0:A:9:0): Unexpected busfree in Message-out phase
Jun 16 15:07:16 raid kernel: SEQADDR == 0x16b
Jun 16 15:07:16 raid kernel: (scsi0:A:9:0): parity error detected in Data-in phase. SEQADDR(0x9e) SCSIRATE(0xc2)
Jun 16 15:07:16 raid kernel:  CRC Value Mismatch
Jun 16 15:07:16 raid kernel: (scsi0:A:9:0): Unexpected busfree in Message-out phase
Jun 16 15:07:16 raid kernel: SEQADDR == 0x16b
.....
Jun 16 15:07:47 raid kernel: scsi0:0:9:0: Attempting to queue an ABORT message
Jun 16 15:07:47 raid kernel: CDB: 0x28 0x0 0xe 0x20 0x80 0x27 0x0 0x0 0x38 0x0
Jun 16 15:07:47 raid kernel: scsi0: At time of recovery, card was not paused
Jun 16 15:07:47 raid kernel: >>>>>>>>>>>>>>>>>> Dump Card State Begins <<<<<<<<<<<<<<<<<
Jun 16 15:07:47 raid kernel: scsi0: Dumping Card State while idle, at SEQADDR 0x16b
Jun 16 15:07:47 raid kernel: Card was paused
Jun 16 15:07:47 raid kernel: ACCUM = 0x7, SINDEX = 0x1, DINDEX = 0xe4, ARG_2 = 0x0
Jun 16 15:07:47 raid kernel: HCNT = 0x0 SCBPTR = 0x2
Jun 16 15:07:47 raid kernel: SCSIPHASE[0x0] SCSISIGI[0x0] ERROR[0x0] SCSIBUSL[0x0]
Jun 16 15:07:47 raid kernel: LASTPHASE[0x1] SCSISEQ[0x12] SBLKCTL[0xa] SCSIRATE[0x0]
Jun 16 15:07:47 raid kernel: SEQCTL[0x10] SEQ_FLAGS[0xc0] SSTAT0[0x0] SSTAT1[0x8]
Jun 16 15:07:47 raid kernel: SSTAT2[0x0] SSTAT3[0x0] SIMODE0[0x8] SIMODE1[0xa4]
Jun 16 15:07:47 raid kernel: SXFRCTL0[0x88] DFCNTRL[0x0] DFSTATUS[0x89]
Jun 16 15:07:47 raid kernel: STACK: 0x34 0xd2 0x163 0x109
Jun 16 15:07:47 raid kernel: SCB count = 8
Jun 16 15:07:47 raid kernel: Kernel NEXTQSCB = 1
Jun 16 15:07:47 raid kernel: Card NEXTQSCB = 7
Jun 16 15:07:47 raid kernel: QINFIFO entries: 7 2
Jun 16 15:07:47 raid kernel: Waiting Queue entries:
Jun 16 15:07:47 raid kernel: Disconnected Queue entries:
Jun 16 15:07:47 raid kernel: QOUTFIFO entries:
Jun 16 15:07:47 raid kernel: Sequencer Free SCB List: 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31
Jun 16 15:07:47 raid kernel: Sequencer SCB Info:
Jun 16 15:07:47 raid kernel:   0 SCB_CONTROL[0x60] SCB_SCSIID[0x97] SCB_LUN[0x0] SCB_TAG[0x0]
Jun 16 15:07:47 raid kernel:   1 SCB_CONTROL[0x60] SCB_SCSIID[0x97] SCB_LUN[0x0] SCB_TAG[0x3]
Jun 16 15:07:47 raid kernel:   2 SCB_CONTROL[0xe0] SCB_SCSIID[0x97] SCB_LUN[0x0] SCB_TAG[0xff]
Jun 16 15:07:47 raid kernel:   3 SCB_CONTROL[0x60] SCB_SCSIID[0x97] SCB_LUN[0x0] SCB_TAG[0xff]
Jun 16 15:07:47 raid kernel:   4 SCB_CONTROL[0x0] SCB_SCSIID[0xff] SCB_LUN[0xff] SCB_TAG[0xff]
Jun 16 15:07:47 raid kernel:   5 SCB_CONTROL[0x0] SCB_SCSIID[0xff] SCB_LUN[0xff] SCB_TAG[0xff]

....

Jun 16 15:07:48 raid kernel: Pending list:
Jun 16 15:07:48 raid kernel:   2 SCB_CONTROL[0x70] SCB_SCSIID[0x97] SCB_LUN[0x0]
Jun 16 15:07:48 raid kernel:   7 SCB_CONTROL[0x70] SCB_SCSIID[0x97] SCB_LUN[0x0]
Jun 16 15:07:48 raid kernel:   0 SCB_CONTROL[0x60] SCB_SCSIID[0x97] SCB_LUN[0x0]
Jun 16 15:07:48 raid kernel:   3 SCB_CONTROL[0x60] SCB_SCSIID[0x97] SCB_LUN[0x0]
Jun 16 15:07:48 raid kernel: Kernel Free SCB list: 6 5 4
Jun 16 15:07:48 raid kernel: DevQ(0:9:0): 0 waiting
Jun 16 15:07:48 raid kernel: DevQ(0:10:0): 0 waiting
Jun 16 15:07:48 raid kernel:
Jun 16 15:07:48 raid kernel: <<<<<<<<<<<<<<<<< Dump Card State Ends >>>>>>>>>>>>>>>>>>
Jun 16 15:07:48 raid kernel: (scsi0:A:9:0): Device is disconnected, re-queuing SCB
Jun 16 15:07:48 raid kernel: Recovery code sleeping
Jun 16 15:07:52 raid kernel: Recovery code awake
Jun 16 15:07:52 raid kernel: Timer Expired
Jun 16 15:07:52 raid kernel: aic7xxx_abort returns 0x2003
Jun 16 15:07:52 raid kernel: scsi0:0:9:0: Attempting to queue an ABORT message
Jun 16 15:07:52 raid kernel: CDB: 0x28 0x0 0xe 0x20 0x82 0x2f 0x0 0x0 0x40 0x0
Jun 16 15:07:52 raid kernel: scsi0: At time of recovery, card was not paused
Jun 16 15:07:52 raid kernel: >>>>>>>>>>>>>>>>>> Dump Card State Begins <<<<<<<<<<<<<<<<<
Jun 16 15:07:52 raid kernel: scsi0: Dumping Card State while idle, at SEQADDR 0x16b
Jun 16 15:07:52 raid kernel: Card was paused
Jun 16 15:07:52 raid kernel: ACCUM = 0x7, SINDEX = 0x1, DINDEX = 0xe4, ARG_2 = 0x0
Jun 16 15:07:52 raid kernel: HCNT = 0x0 SCBPTR = 0x2
Jun 16 15:07:52 raid kernel: SCSIPHASE[0x0] SCSISIGI[0x0] ERROR[0x0] SCSIBUSL[0x0]
Jun 16 15:07:52 raid kernel: LASTPHASE[0x1] SCSISEQ[0x12] SBLKCTL[0xa] SCSIRATE[0x0]
Jun 16 15:07:52 raid kernel: SEQCTL[0x10] SEQ_FLAGS[0xc0] SSTAT0[0x0] SSTAT1[0x8]
Jun 16 15:07:52 raid kernel: SSTAT2[0x0] SSTAT3[0x0] SIMODE0[0x8] SIMODE1[0xa4]
Jun 16 15:07:52 raid kernel: SXFRCTL0[0x88] DFCNTRL[0x0] DFSTATUS[0x89]
Jun 16 15:07:52 raid kernel: STACK: 0x34 0xd2 0x163 0x109
Jun 16 15:07:52 raid kernel: SCB count = 8
Jun 16 15:07:52 raid kernel: Kernel NEXTQSCB = 1
Jun 16 15:07:52 raid kernel: Card NEXTQSCB = 3
Jun 16 15:07:52 raid kernel: QINFIFO entries: 3
Jun 16 15:07:52 raid kernel: Waiting Queue entries:
Jun 16 15:07:52 raid kernel: Disconnected Queue entries:
Jun 16 15:07:52 raid kernel: QOUTFIFO entries:
Jun 16 15:07:52 raid kernel: Sequencer Free SCB List: 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31
Jun 16 15:07:52 raid kernel: Sequencer SCB Info:
Jun 16 15:07:52 raid kernel:   0 SCB_CONTROL[0x60] SCB_SCSIID[0x97] SCB_LUN[0x0] SCB_TAG[0x0]
Jun 16 15:07:52 raid kernel:   1 SCB_CONTROL[0x60] SCB_SCSIID[0x97] SCB_LUN[0x0] SCB_TAG[0x3]

...

Jun 16 15:07:53 raid kernel: Pending list:
Jun 16 15:07:53 raid kernel:   0 SCB_CONTROL[0x60] SCB_SCSIID[0x97] SCB_LUN[0x0]
Jun 16 15:07:53 raid kernel:   3 SCB_CONTROL[0x74] SCB_SCSIID[0x97] SCB_LUN[0x0]
Jun 16 15:07:53 raid kernel: Kernel Free SCB list: 2 7 6 5 4
Jun 16 15:07:53 raid kernel: DevQ(0:9:0): 0 waiting
Jun 16 15:07:53 raid kernel: DevQ(0:10:0): 0 waiting
Jun 16 15:07:53 raid kernel:
Jun 16 15:07:53 raid kernel: <<<<<<<<<<<<<<<<< Dump Card State Ends >>>>>>>>>>>>>>>>>>
Jun 16 15:07:53 raid kernel: Recovery SCB completes
Jun 16 15:07:53 raid kernel: (scsi0:A:9:0): Device is disconnected, re-queuing SCB
Jun 16 15:07:53 raid kernel: Recovery code sleeping
Jun 16 15:07:57 raid kernel: Recovery code awake
Jun 16 15:07:57 raid kernel: Timer Expired
Jun 16 15:07:57 raid kernel: aic7xxx_abort returns 0x2003
Jun 16 15:07:57 raid kernel: scsi0:0:9:0: Attempting to queue an ABORT message
Jun 16 15:07:57 raid kernel: CDB: 0x28 0x0 0xe 0x20 0x8d 0xcf 0x0 0x0 0x20 0x0
Jun 16 15:07:57 raid kernel: scsi0:0:9:0: Command not found
Jun 16 15:07:57 raid kernel: aic7xxx_abort returns 0x2002
Jun 16 15:08:07 raid kernel: scsi0:0:9:0: Attempting to queue an ABORT message
Jun 16 15:08:07 raid kernel: CDB: 0x0 0x0 0x0 0x0 0x0 0x0
Jun 16 15:08:07 raid kernel: scsi0: At time of recovery, card was not paused
Jun 16 15:08:07 raid kernel: >>>>>>>>>>>>>>>>>> Dump Card State Begins <<<<<<<<<<<<<<<<<
Jun 16 15:08:07 raid kernel: scsi0: Dumping Card State while idle, at SEQADDR 0x16b
Jun 16 15:08:07 raid kernel: Card was paused
Jun 16 15:08:07 raid kernel: ACCUM = 0x7, SINDEX = 0x1, DINDEX = 0xe4, ARG_2 = 0x0
Jun 16 15:08:07 raid kernel: HCNT = 0x0 SCBPTR = 0x2
Jun 16 15:08:07 raid kernel: SCSIPHASE[0x0] SCSISIGI[0x0] ERROR[0x0] SCSIBUSL[0x0]
Jun 16 15:08:07 raid kernel: LASTPHASE[0x1] SCSISEQ[0x12] SBLKCTL[0xa] SCSIRATE[0x0]
Jun 16 15:08:07 raid kernel: SEQCTL[0x10] SEQ_FLAGS[0xc0] SSTAT0[0x0] SSTAT1[0x8]
Jun 16 15:08:07 raid kernel: SSTAT2[0x0] SSTAT3[0x0] SIMODE0[0x8] SIMODE1[0xa4]
Jun 16 15:08:07 raid kernel: SXFRCTL0[0x88] DFCNTRL[0x0] DFSTATUS[0x89]
Jun 16 15:08:07 raid kernel: STACK: 0x34 0xd2 0x163 0x109
Jun 16 15:08:07 raid kernel: SCB count = 8
Jun 16 15:08:07 raid kernel: Kernel NEXTQSCB = 3
Jun 16 15:08:07 raid kernel: Card NEXTQSCB = 0
Jun 16 15:08:07 raid kernel: QINFIFO entries: 0 1
Jun 16 15:08:07 raid kernel: Waiting Queue entries:
Jun 16 15:08:07 raid kernel: Disconnected Queue entries:
Jun 16 15:08:07 raid kernel: QOUTFIFO entries:
Jun 16 15:08:07 raid kernel: Sequencer Free SCB List: 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31
Jun 16 15:08:07 raid kernel: Sequencer SCB Info:
Jun 16 15:08:07 raid kernel:   0 SCB_CONTROL[0x60] SCB_SCSIID[0x97] SCB_LUN[0x0] SCB_TAG[0x0]

....

It kept printing messages like the above, and in the end the system printed the kernel panic message below and died.

Kernel panic - not syncing: for safety
Badness in smp_call_function at arch/i386/kernel/smp.c:519
Stack pointer is garbage, not printing trace
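
As a reading aid for the log above: the `CDB:` lines start with opcode 0x28, which is the SCSI READ(10) command, so they can be decoded to see which blocks were being read when the aborts were queued. The following is only a minimal sketch; the function name `decode_read10` is made up for illustration, and it assumes the standard READ(10) layout (bytes 2-5 are the big-endian logical block address, bytes 7-8 the transfer length in blocks). The block size of the array is not known from the log, so no byte offsets are computed.

```python
def decode_read10(cdb_text):
    """Decode a READ(10) CDB given as space-separated hex bytes.

    Returns (logical_block_address, transfer_length_in_blocks).
    Hypothetical helper, not part of any driver or tool.
    """
    cdb = [int(tok, 16) for tok in cdb_text.split()]
    if len(cdb) != 10 or cdb[0] != 0x28:
        raise ValueError("not a 10-byte READ(10) CDB")
    lba = int.from_bytes(bytes(cdb[2:6]), "big")     # bytes 2..5
    blocks = int.from_bytes(bytes(cdb[7:9]), "big")  # bytes 7..8
    return lba, blocks


if __name__ == "__main__":
    # CDB values copied from the log excerpt above.
    for text in (
        "0x28 0x0 0xe 0x20 0x80 0x27 0x0 0x0 0x38 0x0",
        "0x28 0x0 0xe 0x20 0x82 0x2f 0x0 0x0 0x40 0x0",
        "0x28 0x0 0xe 0x20 0x8d 0xcf 0x0 0x0 0x20 0x0",
    ):
        lba, blocks = decode_read10(text)
        print(f"READ(10) lba={lba} blocks={blocks}")
```

All three commands are reads from nearby block addresses on target 9, which is consistent with the errors appearing during a large copy from the array.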
