BlackBird's Blog

Every unfavorable situation in this world is caused by the lack of ability of the person involved

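Preface

A couple of days ago, in the 第五空间 (Fifth Space) competition, there was a pwn challenge called toolkit that I couldn't solve. When I reproduced it afterwards, it turned out to be a stack overflow with a canary bypass. I have been doing canary challenges all along, but I had never actually debugged the relevant source code, so I'm taking this opportunity to summarize what I know about canaries. I won't go into detail on the basics such as leaking the canary (that material is the same everywhere; I'm still copying it in from existing posts with slight changes, in case the original authors' blogs disappear someday. If you find it long-winded, jump straight to the "C++ Exception" bypass section at the very end). The main focus is debugging the canary-related source code and bypassing the canary through the C++ exception mechanism.

Ugh... only after I finished writing did I discover that someone had already written something similar. Oh well, working through it myself isn't a bad thing, it just takes time.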

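Introduction to the Canary

The name comes from the caged canaries that British coal miners used to check whether the gas underground was toxic. Miners took a canary with them every time they went down. If the gas was toxic, the canary, being sensitive to it, would stop singing or even die, giving the miners an early warning.

Mapping the analogy back: the system running a program is the miner going down the shaft, a stack overflow attack is the gas leak, and the canary is what detects the stack overflow attack and keeps the system safe.

As we know, a stack overflow is usually exploited by overflowing a local variable on the stack so that the excess data overwrites the saved rbp, the saved return address, and so on, which hijacks control flow. Stack protection is a mitigation against buffer overflow attacks: when a function has a buffer overflow vulnerability, an attacker can overwrite the return address on the stack to control execution. With stack protection enabled, a function places a cookie on the stack, between its local variables and the saved rbp, when it starts executing. Just before the function actually returns, it verifies that the cookie is still valid (it tests whether the value changed before the stack frame is destroyed), and if not, the program is stopped (stack overflow detected). An attacker who overwrites the return address usually overwrites the cookie as well, so the check fails and the program stops, which prevents the exploit from succeeding. On Linux this cookie is called the canary.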

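How the Canary Is Built and How It Works

Using the Canary with GCC

The canary can be configured in GCC with the following options:

  • -fstack-protector: enables protection, but only for functions whose local variables include an array (per the GCC documentation, character arrays of at least 8 bytes, or calls to alloca)
  • -fstack-protector-all: enables protection for all functions
  • -fstack-protector-strong: like -fstack-protector, but also protects functions that have local arrays of any type or length, or that reference the address of a local variable
  • -fstack-protector-explicit: only protects functions explicitly marked with the stack_protect attribute
  • -fno-stack-protector: disables protection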

Canary Example

#include<stdio.h>
#include<stdlib.h>
void func(){
    char s[0x15];
    scanf("%s", s);
}
int main(){
    func();
}
// With canary:    gcc -g3 -fstack-protector ./test.c -o test_can
// Without canary: gcc -g3 -fno-stack-protector ./test.c -o test

[Image: image-20220923145020402]

Once the program is running, the stack frame looks like this:

    Low 
    Address |                 |
            +-----------------+
            | local variables |
            +-----------------+
  rbp-8 =>  | canary value    |
            +-----------------+
    rbp =>  | saved rbp       |
            +-----------------+
            | return address  |
            +-----------------+
            |                 |

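The extra assembly shown above is simple: in the function prologue, the value at fs:0x28 is stored at rbp-8; in the epilogue, the value at rbp-8 is compared with fs:0x28, and if they differ, __stack_chk_fail is called: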

void __attribute__ ((noreturn)) __stack_chk_fail (void)
{
  __fortify_fail ("stack smashing detected");
}

//glibc2.23 
void __attribute__ ((noreturn)) internal_function __fortify_fail (const char *msg)
{
  /* The loop is added only to keep gcc happy.  */
  while (1)
    __libc_message (2, "*** %s ***: %s terminated\n",
                    msg, __libc_argv[0] ?: "<unknown>");
}

// glibc2.27
void __attribute__ ((noreturn)) __fortify_fail (const char *msg)
{
  __fortify_fail_abort (true, msg);
}

void __attribute__ ((noreturn)) __fortify_fail_abort (_Bool need_backtrace, const char *msg)
{
  /* The loop is added only to keep gcc happy.  Don't pass down
     __libc_argv[0] if we aren't doing backtrace since __libc_argv[0]
     may point to the corrupted stack.  */
  while (1)
    __libc_message (need_backtrace ? (do_abort | do_backtrace) : do_abort,
                    "*** %s ***: %s terminated\n",
                    msg,
                    (need_backtrace && __libc_argv[0] != NULL
                     ? __libc_argv[0] : "<unknown>"));
}

// glibc 2.31 and later
void __attribute__ ((noreturn)) __fortify_fail (const char *msg)
{
  /* The loop is added only to keep gcc happy.  */
  while (1)
    __libc_message (do_abort, "*** %s ***: terminated\n", msg);
}

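Canary Source Code Analysis

Given what the canary is for, it's easy to guess that the checking code is inserted at compile time and that the canary value is initialized before main. According to what I found, it is initialized in security_init. We also know that the program startup sequence is _start -> __libc_start_main -> __libc_csu_init -> _init -> main -> _fini, so I assumed the canary would be initialized in __libc_csu_init. Strange: with a breakpoint at __libc_start_main, the canary had already been initialized. So I went one level up, to _start, but in _start the canary had already been initialized as well... Strange, doesn't the program start at _start?

I looked things up again (the article "linux编程之main()函数启动过程", on how main() gets started on Linux) and found that I hadn't remembered wrong...

In the end I found that when a dynamically linked program is loaded, glibc's ld.so initializes TLS first, and the canary is initialized at that stage. The call chain, starting from ld.so's own _start, is _start -> _dl_start -> _dl_start_final -> _dl_sysdep_start -> dl_main -> security_init: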

// rtld.c
static void
security_init (void)
{
  /* Set up the stack checker's canary.  */
  uintptr_t stack_chk_guard = _dl_setup_stack_chk_guard (_dl_random);
#ifdef THREAD_SET_STACK_GUARD
  THREAD_SET_STACK_GUARD (stack_chk_guard);
#else
  __stack_chk_guard = stack_chk_guard;
#endif

  /* Set up the pointer guard as well, if necessary.  */
  uintptr_t pointer_chk_guard
    = _dl_setup_pointer_guard (_dl_random, stack_chk_guard);
#ifdef THREAD_SET_POINTER_GUARD
  THREAD_SET_POINTER_GUARD (pointer_chk_guard);
#endif
  __pointer_chk_guard_local = pointer_chk_guard;

  /* We do not need the _dl_random value anymore.  The less
     information we leave behind, the better, so clear the
     variable.  */
  _dl_random = NULL;
}

static inline uintptr_t __attribute__ ((always_inline))
_dl_setup_stack_chk_guard (void *dl_random)
{
  union
  {
    uintptr_t num;
    unsigned char bytes[sizeof (uintptr_t)];
  } ret = { 0 };

  if (dl_random == NULL)
    {
      ret.bytes[sizeof (ret) - 1] = 255;
      ret.bytes[sizeof (ret) - 2] = '\n';
    }
  else
    {
      memcpy (ret.bytes, dl_random, sizeof (ret));
#if BYTE_ORDER == LITTLE_ENDIAN
      ret.num &= ~(uintptr_t) 0xff;
#elif BYTE_ORDER == BIG_ENDIAN
      ret.num &= ~((uintptr_t) 0xff << (8 * (sizeof (ret) - 1)));
#else
# error "BYTE_ORDER unknown"
#endif
    }
  return ret.num;
}
#define THREAD_SET_STACK_GUARD(value) \
  THREAD_SETMEM (THREAD_SELF, header.stack_guard, value)
/* Set member of the thread descriptor directly.  */
# define THREAD_SETMEM(descr, member, value) \
  ({ if (sizeof (descr->member) == 1)					      \
       asm volatile ("movb %b0,%%gs:%P1" :				      \
		     : "iq" (value),					      \
		       "i" (offsetof (struct pthread, member)));	      \
     else if (sizeof (descr->member) == 4)				      \
       asm volatile ("movl %0,%%gs:%P1" :				      \
		     : "ir" (value),					      \
		       "i" (offsetof (struct pthread, member)));	      \
     else								      \
       {								      \
	 if (sizeof (descr->member) != 8)				      \
	   /* There should not be any value with a size other than 1,	      \
	      4 or 8.  */						      \
	   abort ();							      \
									      \
	 asm volatile ("movl %%eax,%%gs:%P1\n\t"			      \
		       "movl %%edx,%%gs:%P2" :				      \
		       : "A" ((uint64_t) cast_to_integer (value)),	      \
			 "i" (offsetof (struct pthread, member)),	      \
			 "i" (offsetof (struct pthread, member) + 4));	      \
       }})

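Through debugging we can clearly see that the canary is generated from _dl_random in _dl_setup_stack_chk_guard and written to fs:0x28 by the THREAD_SET_STACK_GUARD macro. (The THREAD_SETMEM definition quoted above is the i386 variant, which writes through %gs; the x86_64 variant writes through %fs.) One thing to note: _dl_setup_stack_chk_guard is marked always_inline and THREAD_SET_STACK_GUARD is just a wrapper macro, so neither can be stepped into while debugging; the deepest you can get is security_init: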

[Image: image-20220923171918446]

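Besides security_init, the canary can also be generated in __libc_start_main. In glibc this code sits in the branch built for statically linked programs, where ld.so is not involved: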

LIBC_START_MAIN (int (*main) (int, char **, char ** MAIN_AUXVEC_DECL),
		 int argc, char **argv,
#ifdef LIBC_START_MAIN_AUXVEC_ARG
		 ElfW(auxv_t) *auxvec,
#endif
		 __typeof (main) init,
		 void (*fini) (void),
		 void (*rtld_fini) (void), void *stack_end)
{
  /*......*/
    /* Set up the stack checker's canary.  */
  uintptr_t stack_chk_guard = _dl_setup_stack_chk_guard (_dl_random);
# ifdef THREAD_SET_STACK_GUARD
  THREAD_SET_STACK_GUARD (stack_chk_guard);
# else
  __stack_chk_guard = stack_chk_guard;
# endif

# ifdef DL_SYSDEP_OSCHECK
  if (!__libc_multiple_libcs)
    {
      /* This needs to run to initiliaze _dl_osversion before TLS
	 setup might check it.  */
      DL_SYSDEP_OSCHECK (__libc_fatal);
    }
# endif

  /* Initialize libpthread if linked in.  */
  if (__pthread_initialize_minimal != NULL)
    __pthread_initialize_minimal ();

  /* Set up the pointer guard value.  */
  uintptr_t pointer_chk_guard = _dl_setup_pointer_guard (_dl_random,
							 stack_chk_guard);
  /*......*/
}

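Now that we've found where the canary is initialized, let's look at the fs register. On x86_64 Linux, fs actually points to the current thread's TLS structure (the TCB), and fs:0x28 is exactly the stack_guard field: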

//tls.h
typedef struct
{
  void *tcb;		/* Pointer to the TCB.  Not necessarily the
			   thread descriptor used by libpthread.  */
  dtv_t *dtv;
  void *self;		/* Pointer to the thread descriptor.  */
  int multiple_threads;
  uintptr_t sysinfo;
  uintptr_t stack_guard;
  uintptr_t pointer_guard;
  int gscope_flag;
  /* Bit 0: X86_FEATURE_1_IBT.
     Bit 1: X86_FEATURE_1_SHSTK.
   */
  unsigned int feature_1;
  /* Reservation of some values for the TM ABI.  */
  void *__private_tm[3];
  /* GCC split stack support.  */
  void *__private_ss;
  /* The lowest address of shadow stack,  */
  unsigned long ssp_base;
} tcbhead_t;

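Let's step back to the caller of security_init, dl_main. In the excerpt, security_init is called early only when audit modules are in use; the need_security_init flag records whether it still has to be called later in dl_main: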

// rtld.c
static void
dl_main (const ElfW(Phdr) *phdr, ElfW(Word) phnum, ElfW(Addr) *user_entry, ElfW(auxv_t) *auxv)
{
  /*......*/
  void *tcbp = NULL;
  /*......*/
  bool need_security_init = true;
  if (__glibc_unlikely (audit_list != NULL)
      || __glibc_unlikely (audit_list_string != NULL))
    {
      /* Since we start using the auditing DSOs right away we need to
	 initialize the data structures now.  */
      tcbp = init_tls ();

      /* Initialize security features.  We need to do it this early
	 since otherwise the constructors of the audit libraries will
	 use different values (especially the pointer guard) and will
	 fail later on.  */
      security_init ();
      need_security_init = false;

      load_audit_modules (main_map);
    }
  /*......*/
}

// rtld.c
static void * init_tls (void)
{
  /* Number of elements in the static TLS block.  */
  GL(dl_tls_static_nelem) = GL(dl_tls_max_dtv_idx);

  /* Do not do this twice.  The audit interface might have required
     the DTV interfaces to be set up early.  */
  if (GL(dl_initial_dtv) != NULL)
    return NULL;

  /* Allocate the array which contains the information about the
     dtv slots.  We allocate a few entries more than needed to
     avoid the need for reallocation.  */
  size_t nelem = GL(dl_tls_max_dtv_idx) + 1 + TLS_SLOTINFO_SURPLUS;

  /* Allocate.  */
  GL(dl_tls_dtv_slotinfo_list) = (struct dtv_slotinfo_list *)
    calloc (sizeof (struct dtv_slotinfo_list)
	    + nelem * sizeof (struct dtv_slotinfo), 1);
  /* No need to check the return value.  If memory allocation failed
     the program would have been terminated.  */

  struct dtv_slotinfo *slotinfo = GL(dl_tls_dtv_slotinfo_list)->slotinfo;
  GL(dl_tls_dtv_slotinfo_list)->len = nelem;
  GL(dl_tls_dtv_slotinfo_list)->next = NULL;

  /* Fill in the information from the loaded modules.  No namespace
     but the base one can be filled at this time.  */
  assert (GL(dl_ns)[LM_ID_BASE + 1]._ns_loaded == NULL);
  int i = 0;
  for (struct link_map *l = GL(dl_ns)[LM_ID_BASE]._ns_loaded; l != NULL;
       l = l->l_next)
    if (l->l_tls_blocksize != 0)
      {
	/* This is a module with TLS data.  Store the map reference.
	   The generation counter is zero.  */
	slotinfo[i].map = l;
	/* slotinfo[i].gen = 0; */
	++i;
      }
  assert (i == GL(dl_tls_max_dtv_idx));

  /* Compute the TLS offsets for the various blocks.  */
  _dl_determine_tlsoffset ();

  /* Construct the static TLS block and the dtv for the initial
     thread.  For some platforms this will include allocating memory
     for the thread descriptor.  The memory for the TLS block will
     never be freed.  It should be allocated accordingly.  The dtv
     array can be changed if dynamic loading requires it.  */
  void *tcbp = _dl_allocate_tls_storage ();
  if (tcbp == NULL)
    _dl_fatal_printf ("\
cannot allocate TLS data structures for initial thread\n");

  /* Store for detection of the special case by __tls_get_addr
     so it knows not to pass this dtv to the normal realloc.  */
  GL(dl_initial_dtv) = GET_DTV (tcbp);

  /* And finally install it for the main thread.  */
  const char *lossage = TLS_INIT_TP (tcbp);
  if (__glibc_unlikely (lossage != NULL))
    _dl_fatal_printf ("cannot set up thread-local storage: %s\n", lossage);
  tls_init_tp_called = true;

  return tcbp;
}

Here we can see how the TCB is allocated and installed for the main thread.

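Canary-Related Attack Techniques

The __stack_chk_fail Function

Once the canary has been tampered with, the program calls __stack_chk_fail.

Hijacking the GOT

We can hijack this function's GOT entry to take control of the execution flow.

Example challenge: ZCTF2017 Login

Stack Smash | SSP (Stack Smashing Protector) leak

When the canary check fails, the program aborts with an error message. Let's look at the source code behind __stack_chk_fail: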

//glibc2.23 
void __attribute__ ((noreturn)) internal_function __fortify_fail (const char *msg)
{
  /* The loop is added only to keep gcc happy.  */
  while (1)
    __libc_message (2, "*** %s ***: %s terminated\n",
                    msg, __libc_argv[0] ?: "<unknown>");
}

// glibc2.27
void __attribute__ ((noreturn)) __fortify_fail (const char *msg)
{
  __fortify_fail_abort (true, msg);
}

void __attribute__ ((noreturn)) __fortify_fail_abort (_Bool need_backtrace, const char *msg)
{
  /* The loop is added only to keep gcc happy.  Don't pass down
     __libc_argv[0] if we aren't doing backtrace since __libc_argv[0]
     may point to the corrupted stack.  */
  while (1)
    __libc_message (need_backtrace ? (do_abort | do_backtrace) : do_abort,
                    "*** %s ***: %s terminated\n",
                    msg,
                    (need_backtrace && __libc_argv[0] != NULL
                     ? __libc_argv[0] : "<unknown>"));
}

// glibc 2.31 and later
void __attribute__ ((noreturn)) __fortify_fail (const char *msg)
{
  /* The loop is added only to keep gcc happy.  */
  while (1)
    __libc_message (do_abort, "*** %s ***: terminated\n", msg);
}

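In glibc 2.23, this function prints the string that __libc_argv[0] points to. Normally this pointer points to the program name, and the pointer itself is stored on the stack. When the overflow is large enough, we can overwrite the __libc_argv[0] pointer and have the error message print whatever we need. In the glibc 2.27 code above, __fortify_fail_abort only passes __libc_argv[0] along when need_backtrace is true; as far as I can tell, __stack_chk_fail in that version calls __fortify_fail_abort with false, so "<unknown>" is printed instead. From glibc 2.31 on, __fortify_fail no longer prints __libc_argv[0] at all.

  • Example challenges: 32C3 CTF readme, 34C3 CTF readme-revenge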

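Byte-by-Byte Brute Force

Although the canary is different every time a process is restarted, different threads within the same process share the same canary, and a child process created by fork also has the same canary, because fork copies the parent's memory directly. We can take advantage of this to brute-force the canary completely, one byte at a time.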

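Overwriting the Null Byte

By design, the canary's least significant byte (on little-endian machines this is the byte at the lowest address, right next to the local variables) is a null byte. It truncates strings and separates the canary from the local variables. So we can overwrite this null byte first and then print the local variable, which leaks the rest of the canary along with it.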

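The TLS Structure

Because the canary value is stored in the TLS structure, we can obtain the canary by leaking the stack_guard field of tcbhead_t, or set the canary to a value of our choosing by overwriting that field.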

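Bypassing the Canary via the C++ Exception Mechanism

For the details of how C++ exception handling works, see ☁☁'s blog.

What we need to know is this: when a function throws an exception, the program unwinds the call chain upwards from the current function until it finds a matching catch, and once the catch block has run, execution continues in the function that contains the catch. At first I thought this was some incredibly fancy technique, but it is actually simple and comes down to two points:

  1. When the stack is unwound out of the function that throws, the canary is not checked
  2. Functions containing the try/catch keywords have no canary by default (unless compiled with -fstack-protector-all)

If the function containing try/catch has no canary and the overflow is long enough, you can lay out a ROP chain directly (it fires when the function containing the catch returns). If that function has no canary but the overflow is not long enough, you can pivot the stack instead (again, it fires when the function containing the catch returns).

The key is to look at the function that contains the catch and check whether that function has a canary.

Example challenges: Shanghai-DCTF-2017 offline attack-and-defense Pwn challenge; "第五空间CTF的toolkit学到的两个利用技巧" (two exploitation tricks learned from toolkit in the 第五空间 CTF)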

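References

《CTF竞赛权威指南: PWN篇》 (The Definitive Guide to CTF Competitions: Pwn), p. 58, 4.2 Stack Canary

CTFwiki-Canary

GCC SSP Canary功能简介 (A brief introduction to the GCC SSP canary feature) | 御风而行 (donald-zhuang.github.io)

linux进程分析之旅 (A tour of Linux process analysis)

C++异常处理 (C++ exception handling)

canary的绕过姿势 (Canary bypass techniques)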
